The differences between unstructured and structured data
Structured data is information, usually stored as text files,
arranged in titled columns and rows that data mining tools can easily order
and process. It can be pictured as a perfectly organized filing cabinet in
which everything is identified, labeled, and easy to access.
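As a minimal sketch of why titled columns make data easy to order and process, the snippet below reads a small table and sorts it by one column. The sample rows and the column names (customer_id, name, total_spent) are made up for illustration.

```python
import csv
import io

# Hypothetical sample: a small table with titled columns, stored as CSV text.
raw = """customer_id,name,total_spent
3,Carol,120.50
1,Alice,310.00
2,Bob,75.25
"""

# Each row becomes a dict keyed by the column titles.
rows = list(csv.DictReader(io.StringIO(raw)))

# Because every row shares the same named columns, ordering is a one-liner.
by_spend = sorted(rows, key=lambda r: float(r["total_spent"]), reverse=True)
top_customer = by_spend[0]["name"]

# Aggregating a column is just as direct.
total = sum(float(r["total_spent"]) for r in rows)
print(top_customer, total)
```

Nothing here depends on the meaning of the data; the column titles alone are enough for a tool to sort, filter, or total it.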
Unstructured data is information that either does not have a
predefined data model or is not organized in a predefined manner.
Common forms of unstructured data:
- Word documents, PDFs and other text files - Books, letters, other written documents, audio and video transcripts
- Audio Files - Customer service recordings, voicemails, 911 phone calls
- Presentations - PowerPoints, SlideShares
- Videos - Police dash cam, personal video, YouTube uploads
- Images - Pictures, illustrations, memes
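To illustrate what the lack of a predefined model means in practice, here is a small sketch that pulls one field out of free text. The transcript and the pattern are hypothetical; real unstructured sources usually need far more elaborate extraction.

```python
import re

# Hypothetical call transcript: free text with no predefined fields.
transcript = (
    "Hi, this is Dana calling about order 48213. "
    "It arrived damaged on March 3 and I would like a refund."
)

# There is no "order_id" column to read, so a pattern has to be guessed
# and applied before the text can be processed like structured data.
match = re.search(r"order (\d+)", transcript)
order_id = int(match.group(1)) if match else None

record = {
    "order_id": order_id,
    "mentions_refund": "refund" in transcript.lower(),
}
print(record)
```

The extraction step is where the structure gets imposed; if the caller had phrased the sentence differently, the pattern could miss the order number entirely.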
Data types:
+Identity data helps businesses relate all other
information to a unique person, group, corporation, institution, digital asset
or other entity.
+Descriptive data includes all objective information that is
used to describe the identity.
+Activity data records the actions taken by the identity.
+Subjective data covers opinions offered by the identity
about other identities.
+Relationship data refers to information about how
identities relate indirectly to other identities.
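One way to picture how the five data types hang together is a record that keeps each kind in its own slot, all tied to one identity. This is only an illustrative sketch; the class names, field names, and sample values are assumptions, not a standard schema.

```python
from dataclasses import dataclass, field

@dataclass
class Identity:
    # Identity data: the unique key everything else is related to.
    identity_id: str
    kind: str  # e.g. "person", "corporation", "digital asset"

@dataclass
class Profile:
    identity: Identity
    descriptive: dict = field(default_factory=dict)    # objective attributes
    activity: list = field(default_factory=list)       # actions taken
    subjective: list = field(default_factory=list)     # opinions about others
    relationships: list = field(default_factory=list)  # links to other identities

# Hypothetical customer record.
alice = Profile(Identity("c-001", "person"))
alice.descriptive["city"] = "Austin"
alice.activity.append(("2014-02-01", "purchased", "sku-17"))
alice.subjective.append(("sku-17", "rating", 4))
alice.relationships.append(("employee_of", "org-09"))
print(alice)
```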
A data warehouse “is a relational database that is designed
for query and analysis rather than for transaction processing. It usually
contains historical data derived from transaction data, but it can include data
from other sources. It separates analysis workload from transaction workload
and enables an organization to consolidate data from several sources.” The
data are organized in a way that is relevant and meaningful. The diagram shows
a data warehouse’s architecture:
A data warehouse helps businesses collect data and build up
big data. With that big data, businesses can build analytics tools that extract
data and make it more meaningful. Businesses can then make decisions based
on this data.
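The idea of consolidating several sources and querying them for analysis can be sketched with an in-memory SQLite database. This is a toy stand-in for a real warehouse; the two sources, the sales table, and its columns are hypothetical.

```python
import sqlite3

# Two hypothetical operational sources, each holding its own transactions.
web_orders = [("2014-01-05", "Alice", 40.0), ("2014-01-09", "Bob", 15.0)]
store_orders = [("2014-01-07", "Alice", 25.0), ("2014-02-02", "Carol", 60.0)]

# The "warehouse": one table that consolidates historical data from both.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE sales (order_date TEXT, customer TEXT, amount REAL, channel TEXT)"
)
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?, 'web')", web_orders)
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?, 'store')", store_orders)

# An analysis query: revenue per month across all channels.
monthly = warehouse.execute(
    "SELECT substr(order_date, 1, 7) AS month, SUM(amount) "
    "FROM sales GROUP BY month ORDER BY month"
).fetchall()
print(monthly)
```

The query runs against the consolidated copy, so the systems that take the orders are not slowed down by the analysis workload.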
Benefits of data warehousing:
1. Competitive advantage is gained by allowing
decision-makers access to data that can reveal previously unavailable, unknown,
and untapped information on, for example, customers, trends, and demands.
2. More cost-effective decision-making: Data warehousing
helps to reduce the overall cost of the product by reducing the number of
channels.
3. Increased productivity of corporate decision-makers by
creating an integrated database of consistent, subject-oriented, historical
data.
Limitations of data warehousing:
1. Extra Reporting Work
2. Cost/Benefit Ratio
3. Data Ownership Concerns
4. Complexity of integration
5. Underestimation of resources for data loading
6. Required data not captured
7. High maintenance
The role of data warehouses
Data warehouses will become more critical to future business
operations. With more than 95% of the world's data being unstructured,
businesses need warehouses to store that data and process it into meaningful
information. This will enable businesses to forge ahead with unprecedented
speed and agility.
Listed below are the top data warehousing trends for 2014:
1. Hadoop optimizes data warehousing environments by
accelerating data transformation.
2. Customer experience (CX) strategies gain real-time
insight to improve marketing campaigns.
3. Engineered systems become the de facto standard for
large-scale information management activities.
4. On-demand sandbox analytics environments meet rising
demand for rapid prototyping and information discovery.
5. In-database analytics simplifies data-driven analysis.
Source: