Exploring and visualizing data
During this phase of the data science pipeline, the data scientist uses various methods to dig deeper into the data. Typically, several graphical representations are created (again, either manually or through a programming script or tool) to emphasize or validate an observation, a particular point, or a belief the data scientist holds. This is a significant step in the overall process: the data scientist may come to understand that additional processing should be done on the data, that additional data needs to be collected, or that the original theories appear to be validated. These findings are cause for a pause to reflect on the next steps. Should the data scientist proceed with the formal analysis process, perhaps creating a predictive model for automated learning? Or should the scientist revisit a previous step, collecting additional (or different) data for processing?
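As a rough illustration of this kind of exploration, the sketch below summarizes a small sample, draws a crude text histogram, and flags suspicious values. It assumes Python and only its standard library. The `ages` sample and the helper names `summarize`, `text_histogram`, and `flag_outliers` are hypothetical, and the 1.5 × IQR outlier rule is one common convention chosen here for illustration, not a method prescribed by the text.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


def text_histogram(values, bins=5, width=40):
    """Build a crude histogram as a list of text lines, one per bin."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero when all values are equal
    counts = [0] * bins
    for v in values:
        # The maximum value would land in bin index `bins`, so clamp it.
        idx = min(int((v - lo) / span * bins), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    lines = []
    for i, c in enumerate(counts):
        left = lo + span * i / bins
        right = lo + span * (i + 1) / bins
        bar = "#" * round(c / peak * width)
        lines.append(f"{left:8.1f} to {right:8.1f} | {bar} ({c})")
    return lines


def flag_outliers(values, k=1.5):
    """Return values outside k * IQR of the quartiles (assumed convention)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - k * iqr or v > q3 + k * iqr]


# Hypothetical sample: ages from a survey, with one suspicious entry.
ages = [23, 25, 31, 35, 36, 38, 41, 42, 44, 47, 52, 58, 61, 130]

summary = summarize(ages)
histogram = text_histogram(ages)
outliers = flag_outliers(ages)

for name, value in summary.items():
    print(f"{name}: {value}")
print("\n".join(histogram))
print("possible outliers:", outliers)
```

A flagged value such as an age of 130 is the sort of finding that prompts the pause described above: it may call for more cleaning, or for a closer look at how the data was collected, before any formal analysis begins.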