What are common Raw Data Formats & Data Sources for an Internet Business?

In the context of internet businesses, raw data refers to the unprocessed, unfiltered information gathered from customer activities and interactions with the business.




Characteristics of Raw Data:
Raw data is primarily characterized by its unprocessed state. It's often voluminous, collected from diverse interactions and operations, and can vary significantly in format and structure. This data is the direct output from data sources before any cleaning, structuring, or analysis has taken place.




Common Types of Raw Data in Internet Businesses:
Internet businesses collect several types of raw data, each serving different analytical purposes:


  • Event-Based Data: Derived from user interactions with a website or application, such as page visits, clicks, or app feature usage. It's sequential and often timestamped, providing a granular view of user behavior.

  • Transactional Data: Comes from business transactions, including details of purchases, subscriptions, or any other customer transactions. It is critical for analyzing sales trends, customer value, and financial forecasting.

  • Customer Data: Involves all information related to customers' profiles and their interactions with the business. This includes demographic information, preferences, feedback, and support history.
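
To make the three types concrete, here is a minimal sketch of what one raw record of each kind might look like. Every field name and value below is a made-up illustration, not a standard schema. The values are left as strings where a raw export would typically deliver them, and the events are deliberately out of order to show why the timestamp matters.

```python
from datetime import datetime

# Hypothetical raw records; all field names and values are illustrative.
events = [
    {"user_id": "u1", "event": "click", "target": "add_to_cart",
     "ts": "2024-03-01T10:00:05+00:00"},
    {"user_id": "u1", "event": "page_view", "target": "/product/42",
     "ts": "2024-03-01T10:00:00+00:00"},
]
transaction = {"order_id": "o-1001", "user_id": "u1",
               "amount": "19.99", "currency": "USD"}
customer = {"user_id": "u1", "country": "DE", "plan": "free"}

# Event data is sequential, so order it by timestamp before
# reading it as a stream of user behavior.
ordered = sorted(events, key=lambda e: datetime.fromisoformat(e["ts"]))

for e in ordered:
    print(e["ts"], e["event"], e["target"])
```

In this sketch all three records share a user_id, which is what would let the event, transactional, and customer views of the same person be linked later on.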



Common Data Formats:
Raw data can be stored and processed in various formats, each suitable for different types of analysis and tools:


  • CSV/Excel: Widely used for structured data that fits well into tabular formats, like spreadsheets.

  • JSON/XML: Common formats for semi-structured data, often used in web services and APIs for data interchange.

  • Text Logs/Images/Videos: Examples of unstructured data that require specific processing techniques to extract meaningful information.
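
The difference between these formats shows up in how much work it takes to read them. The sketch below uses only the Python standard library and small in-memory samples; the sample contents and the layout of the log line are assumptions made for illustration. The CSV and JSON samples parse directly, while the text log needs a hand-written pattern before any field can be used.

```python
import csv
import io
import json
import re

# Structured: CSV text with a header row (in-memory stand-in for a file).
csv_text = "order_id,amount\no-1001,19.99\no-1002,5.00\n"
orders = list(csv.DictReader(io.StringIO(csv_text)))

# Semi-structured: a JSON payload with a nested object.
json_text = '{"user_id": "u1", "event": "click", "props": {"button": "buy"}}'
event = json.loads(json_text)

# Unstructured: a free-form log line. The layout is assumed for this
# sketch, so the pattern below would need adapting to real logs.
log_line = "2024-03-01 10:00:05 INFO user=u1 path=/checkout status=200"
LOG_PATTERN = re.compile(
    r"^(\S+ \S+) (\w+) user=(\S+) path=(\S+) status=(\d+)$"
)

log_record = None
match = LOG_PATTERN.match(log_line)
if match:
    timestamp, level, user, path, status = match.groups()
    log_record = {"timestamp": timestamp, "level": level,
                  "user": user, "path": path, "status": int(status)}

print(orders[0]["amount"])       # CSV values arrive as strings
print(event["props"]["button"])  # nested JSON fields are directly addressable
print(log_record)
```

Images and videos are left out of the sketch because extracting information from them needs specialized processing beyond simple parsing.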



Typical Data Sources for Internet Businesses:
The data sources for an internet business are as varied as the business activities:


  • Internal Systems: Data generated from the company's own websites, applications, and customer relationship management systems, capturing everything from user behavior to operational efficiency.

  • External Platforms: Data coming from marketing platforms, social media, advertising networks, and third-party analytics services. This includes performance metrics of marketing campaigns, social engagement data, and broader market trends.

  • Partner Ecosystem: Includes data shared by logistics, payment processing, and other business partners. This data is crucial for understanding supply chain dynamics, payment trends, and customer fulfillment processes.



Takeaway:
In this concept, we covered what raw data is, the types of raw data you'd come across in an internet business, and their common formats and sources.

We will refer back to these types and sources of raw data when covering data transformation techniques later in this skilleton.