
Data Journalism Fundamentals
20 Flashcards




Open Source Intelligence (OSINT)




OSINT refers to the collection and analysis of information that is gathered from public, or open, sources. Journalists use this for investigative reporting, often in political or security contexts.




Scraping PDFs




Journalists sometimes use tools to extract data from PDF files when information is not available in a more data-friendly format. This can be technically challenging but useful for analysis.




CSV Files




CSV (Comma Separated Values) files store tabular data in plain text form. They are a common format for exchanging data between databases and spreadsheet programs.
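A minimal sketch of writing and reading a CSV with Python's built-in `csv` module; the city and population figures are invented for illustration.

```python
import csv
import io

# Write a small hypothetical table to an in-memory buffer
# (a real script would use a file on disk instead).
raw = io.StringIO()
writer = csv.writer(raw)
writer.writerow(["city", "population"])
writer.writerow(["Springfield", "52000"])
writer.writerow(["Shelbyville", "41000"])

# Read it back; DictReader maps each row to the header names.
raw.seek(0)
rows = list(csv.DictReader(raw))
print(rows[0]["city"])        # Springfield
print(rows[1]["population"])  # 41000
```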




Data Mining




Data mining is the process of discovering patterns and knowledge from vast amounts of data. In journalism, it's used to analyze databases and find stories hidden within the numbers.




Interactive Maps




Interactive maps are used in journalism to display geospatial data, allowing readers to explore data by interacting with the map, such as zooming in/out or clicking on elements for more detail.




Data Ethics




Data ethics in journalism covers the principles of right and wrong that guide journalists in the practices of gathering, analyzing, and visualizing data, protecting sources, and respecting privacy.




Structured Query Language (SQL)




SQL is a standardized language for managing relational databases and performing data operations such as querying, updating, and deleting records.
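A short sketch of a typical newsroom query using Python's built-in `sqlite3` module; the grants table and amounts are hypothetical.

```python
import sqlite3

# Build an in-memory database with a made-up grants table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grants (agency TEXT, amount INTEGER)")
conn.executemany("INSERT INTO grants VALUES (?, ?)",
                 [("EPA", 120000), ("DOE", 95000), ("EPA", 40000)])

# A common reporting query: total grant money per agency, largest first.
rows = conn.execute(
    "SELECT agency, SUM(amount) AS total "
    "FROM grants GROUP BY agency ORDER BY total DESC"
).fetchall()
print(rows)  # [('EPA', 160000), ('DOE', 95000)]
```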




Regression Analysis




Regression analysis is a statistical method that allows journalists to determine the strength and character of the relationship between one dependent variable and one or more independent variables.




Sentiment Analysis




Sentiment analysis is the process of computationally identifying and categorizing opinions expressed in text, especially on social media, to determine the writer's attitude.
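A toy lexicon-based scorer that shows the idea; real systems use trained models, and the word lists here are illustrative assumptions, not a standard lexicon.

```python
# Hypothetical mini-lexicons of positive and negative words.
POSITIVE = {"great", "good", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "awful"}

def sentiment(text: str) -> str:
    # Lowercase, split, and strip trailing punctuation before lookup.
    words = [w.strip(".,!?") for w in text.lower().split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(sentiment("I love this great policy"))    # positive
print(sentiment("a terrible, awful decision"))  # negative
```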




Time Series Analysis




Time series analysis involves analyzing a time-ordered sequence of data points to understand underlying trends, seasonal patterns, or other structures in the data.
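One of the simplest time-series techniques is smoothing with a moving average; this sketch uses only the standard library, and the monthly unemployment-style figures are made up.

```python
# Hypothetical monthly readings, in chronological order.
values = [5.0, 5.2, 5.1, 5.6, 6.0, 6.4, 6.3]

# A 3-point moving average: each output is the mean of a sliding window,
# which damps month-to-month noise and makes the upward trend visible.
window = 3
smoothed = [round(sum(values[i:i + window]) / window, 2)
            for i in range(len(values) - window + 1)]
print(smoothed)  # [5.1, 5.3, 5.57, 6.0, 6.23]
```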




Freedom of Information Act




In the United States, the Freedom of Information Act (FOIA) allows for the full or partial disclosure of previously unreleased information and documents controlled by government agencies.




Data-Driven Journalism




Data-driven journalism is a journalistic process based on analyzing and filtering large data sets for the purpose of creating a news story or report.




Natural Language Processing (NLP)




NLP is a field of artificial intelligence that gives machines the ability to read, understand, and derive meaning from human language; in journalism it is useful for analyzing large amounts of text.
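A first NLP step is tokenizing text and counting word frequencies; this stdlib-only sketch shows the idea (real pipelines use libraries such as spaCy or NLTK), and the sample sentence is invented.

```python
from collections import Counter
import re

# Lowercase the text and pull out alphabetic tokens with a regex.
speech = "The budget grows. The budget shrinks. Budget talks continue."
tokens = re.findall(r"[a-z]+", speech.lower())

# The two most frequent words hint at what the text is about.
common = Counter(tokens).most_common(2)
print(common)  # [('budget', 3), ('the', 2)]
```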




Data Visualization




Data visualization refers to the representation of data or information in a graph, chart, or other visual format. It helps to communicate data clearly and efficiently to readers.
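A deliberately minimal text bar chart, just to show the mapping from numbers to visual marks; newsroom graphics are built with libraries such as Matplotlib or D3, and the regional totals here are invented.

```python
# Hypothetical counts per region.
data = {"North": 42, "South": 17, "East": 29}

# One '#' per 5 units: the bar length encodes the value.
lines = [f"{region:<6}{'#' * (value // 5)} ({value})"
         for region, value in data.items()]
for line in lines:
    print(line)
```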




Crowdsourcing




Crowdsourcing refers to obtaining information by enlisting the services of a large number of people, either paid or unpaid, typically via the Internet. Journalists use it for gathering data and stories.




Infographics




Infographics are a type of data visualization that present complex information quickly and clearly. These are often used in journalism to provide a high-level overview of a story.




APIs (Application Programming Interfaces)




APIs allow journalists to automatically gather data from a variety of sources, which can be used for real-time data reporting and visualization.
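A sketch of handling a JSON payload like one an open-data API might return; the endpoint, fields, and readings are hypothetical, and a real script would fetch the payload with `urllib.request` or the `requests` library first.

```python
import json

# A canned response standing in for a live air-quality API.
payload = ('{"results": [{"station": "Downtown", "pm25": 14.2},'
           ' {"station": "Airport", "pm25": 9.8}]}')

# Parse the JSON and find the station with the worst reading.
data = json.loads(payload)
worst = max(data["results"], key=lambda r: r["pm25"])
print(worst["station"])  # Downtown
```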




Web Scraping




Web scraping is a technique for extracting information from websites. Journalists use this to collect data from online sources when there is no API available.
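A stdlib-only sketch using `html.parser` to pull link targets out of a page; the HTML snippet is a stand-in for a fetched page, which a real scraper would first download.

```python
from html.parser import HTMLParser

# Collect the href attribute of every <a> tag encountered.
class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

page = ('<ul><li><a href="/budget.csv">Budget</a></li>'
        '<li><a href="/salaries.csv">Salaries</a></li></ul>')
parser = LinkCollector()
parser.feed(page)
print(parser.links)  # ['/budget.csv', '/salaries.csv']
```

In practice journalists often reach for `requests` plus BeautifulSoup, but the underlying task, walking the markup and extracting attributes, is the same.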




Computer-Assisted Reporting (CAR)




CAR involves the use of computers to gather, analyze, and produce news stories. It encompasses practices such as data analysis, data visualization, and database-driven investigative reporting.




Data Cleaning




Data cleaning is the process of detecting and correcting (or removing) corrupt or inaccurate records from a record set or database. For journalists, this ensures data integrity before analysis.
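A small sketch of two routine cleaning steps, normalizing inconsistent names and dropping duplicate records; the agency records below are invented for illustration.

```python
# Messy input: inconsistent casing, stray whitespace, formatted numbers,
# and a duplicate entry hiding behind those inconsistencies.
records = [
    {"agency": " Dept. of Health ", "amount": "1,200"},
    {"agency": "dept. of health",   "amount": "1,200"},
    {"agency": "Parks Dept.",       "amount": "950"},
]

cleaned = []
seen = set()
for rec in records:
    agency = rec["agency"].strip().title()          # normalize name
    amount = int(rec["amount"].replace(",", ""))    # "1,200" -> 1200
    key = (agency, amount)
    if key not in seen:                             # drop duplicates
        seen.add(key)
        cleaned.append({"agency": agency, "amount": amount})

print(cleaned)  # the two health records collapse into one
```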
© Hypatia.Tech. 2024 All rights reserved.