Welcome to the first blog post in a series about different technologies and concepts. This is my first time writing a blog, so corrections, comments or any advice in general are highly appreciated.
A little bit about me: I am a Software Engineer with 1+ years of experience in Data Analysis, AI, Machine Learning & Back-End Development. Ever since I was a kid, I have been fascinated with computers, how they work and everything about tech really.
Now that introductions are out of the way and we have become acquaintances, let's get on with it and KISS!
Don't worry, not that type of kiss. KISS stands for
Keep It Simple, Stupid.
It is a design principle stating that designs and systems should be as simple as possible. Complexity should be avoided wherever it can be, because simpler systems tend to be easier for users to accept and interact with. That is what we will try to do in these blog posts.
There are a lot of acronyms and jargon that get thrown around whenever you read an article, a paper or any sort of tech-related post, like:
- Data Analysis
- Data Engineering
- Machine Learning
- Artificial Intelligence (AI)
- Deep Learning (DL)
and many more...
All these terms may seem like a lot, but we will try to give a brief introduction to each of them and to how they relate to or differ from each other.
Data analysis is the process of cleaning, transforming, and modeling data to discover useful information that will help us make informed business decisions.
An example from our daily lives is ordering takeout or delivery: we compile a list of potential restaurants, choose the meal that suits us based on our hunger level, our budget and the delivery time, then actually make the decision to order that specific meal. We do data analysis almost every single day without even noticing it or registering it as a chore.
The process consists of five major steps that are carried out iteratively:
- Identify: What are we trying to solve, or what obstacle is currently facing us?
- Collect: Gather the raw data needed to answer the identified question. It can come from various sources, such as:
  - Ready-made datasets
  - Web scraping for specific information
  - Social media or usage data from various websites
- Clean: This often involves removing duplicates and inconsistencies, standardizing the format, and dealing with white space and other syntax errors.
- Analyze: By manipulating the data with various data analysis techniques and tools, you can begin to find trends and correlations that start to tell a story.
- Interpret: Review the results of your analysis to see how well you answered your original question and what recommendations you can make based on the data.
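To make the five steps a bit more concrete, here is a minimal sketch in Python that reuses the takeout idea. Everything in it is made up for illustration: the restaurant names, the prices and the helper functions `clean`, `analyze` and `interpret` are assumptions, not part of any real dataset or library.

```python
from statistics import mean

# Step 1 - Identify: which restaurant is cheapest on average?
# Step 2 - Collect: hypothetical raw records (restaurant name, price as text).
raw_orders = [
    ("  Pizza Place", "12.50"),
    ("pizza place", "11.50"),
    ("Burger Barn ", "9.00"),
    ("Burger Barn ", "9.00"),  # exact duplicate row
    ("Sushi Spot", ""),        # missing price
    ("Sushi Spot", "15.00"),
]


def clean(records):
    """Step 3 - Clean: trim white space, standardize case, drop bad rows and duplicates."""
    seen = set()
    cleaned = []
    for name, price in records:
        name = name.strip().title()
        price = price.strip()
        if not price:
            continue  # skip rows with a missing price
        row = (name, float(price))
        if row in seen:
            continue  # skip exact duplicates
        seen.add(row)
        cleaned.append(row)
    return cleaned


def analyze(records):
    """Step 4 - Analyze: compute the average price per restaurant."""
    prices = {}
    for name, price in records:
        prices.setdefault(name, []).append(price)
    return {name: mean(values) for name, values in prices.items()}


def interpret(averages):
    """Step 5 - Interpret: answer the original question."""
    return min(averages, key=averages.get)


cleaned_orders = clean(raw_orders)
averages = analyze(cleaned_orders)
cheapest = interpret(averages)
print(averages)
print("Cheapest on average:", cheapest)
```

In practice each step is usually much messier than this, and you often loop back (for example, the analysis may reveal more cleaning to do), which is why the process is described as iterative.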
Don't worry if you don't understand some of the steps mentioned above, as we will dive deeper into them in upcoming blogs.
Data engineering is the practice of designing and building systems for the collection, storage, and analysis of huge amounts of data at scale.
Data engineers create data pipelines that prepare data for data models and analysts. Take Uber and one of its competitors, DiDi, as an example. Their data engineers would gather and prepare trip data, including routes, spending rates and ratings, so that analysts can study it further and make informed decisions.
Roughly, the operations in a data pipeline consist of the following phases:
- Ingestion: Gathering the needed data
- Processing: Processing the data to get the end results you want
- Storage: Storing the end results for fast retrieval in the future
- Access: Enabling end users such as analysts to access the data for further processing or manipulation
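The four phases above can be sketched as four small Python functions. This is only a toy illustration under stated assumptions: the trip records are hard-coded and hypothetical, the function names (`ingest`, `process`, `store`, `access`) and the `city_summary` table are made up for this post, and an in-memory SQLite database stands in for real storage. Real pipelines at ride-hailing scale look very different.

```python
import sqlite3


def ingest():
    """Ingestion: gather raw trip records (hard-coded, hypothetical data)."""
    return [
        {"city": "City A", "fare": 60.0, "rating": 5},
        {"city": "City A", "fare": 40.0, "rating": 4},
        {"city": "City B", "fare": 30.0, "rating": 3},
    ]


def process(trips):
    """Processing: aggregate trips into per-city averages."""
    totals = {}
    for trip in trips:
        city = totals.setdefault(trip["city"], {"trips": 0, "fare": 0.0, "rating": 0})
        city["trips"] += 1
        city["fare"] += trip["fare"]
        city["rating"] += trip["rating"]
    return [
        (name, t["trips"], t["fare"] / t["trips"], t["rating"] / t["trips"])
        for name, t in totals.items()
    ]


def store(rows, connection):
    """Storage: save the results so they can be retrieved quickly later."""
    connection.execute(
        "CREATE TABLE city_summary "
        "(city TEXT PRIMARY KEY, trips INTEGER, avg_fare REAL, avg_rating REAL)"
    )
    connection.executemany("INSERT INTO city_summary VALUES (?, ?, ?, ?)", rows)
    connection.commit()


def access(connection, city):
    """Access: let an analyst look up the summary for one city."""
    query = "SELECT trips, avg_fare, avg_rating FROM city_summary WHERE city = ?"
    return connection.execute(query, (city,)).fetchone()


connection = sqlite3.connect(":memory:")  # in-memory database, gone when the script ends
store(process(ingest()), connection)
print(access(connection, "City A"))
```

Keeping each phase in its own function means one phase can be swapped out (say, reading from a file instead of a hard-coded list) without touching the others.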
Machine Learning vs. Artificial Intelligence vs. Deep Learning
As the picture above indicates, these terms are intertwined and all fall under the same umbrella: Artificial Intelligence. Let's start with AI and what it is:
AI is the broad field in which we try to build human-like intelligence into machines, for example through sets of rules and algorithms. Its goal is to make machines smarter and more human-like.
ML is a subset of AI that uses statistics to build smart systems. These systems learn, adapt and improve from the input data without being explicitly told how. They use the data to detect patterns, make observations and make better future decisions. The recommendation systems behind Netflix and Spotify are examples of how ML is part of our daily lives.
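As a small taste of what "learning from data without being explicitly told how" means, here is a minimal sketch that fits a straight line to a few points using ordinary least squares. The data points and the function names `fit_line` and `predict` are made up for illustration, and real recommendation systems are far more involved than this. The point is only that the program is never given the rule behind the data; it estimates it from the examples.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of the best-fitting line (ordinary least squares)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the learned line to make a prediction for a new input."""
    return slope * x + intercept


# Hypothetical training data that happens to follow y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print("learned slope:", slope, "learned intercept:", intercept)
print("prediction for x = 5:", predict(5.0, slope, intercept))
```

If you change the numbers in `ys`, the learned slope and intercept change with them, which is the "adapt and improve based on the input data" part.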
DL is a subset of ML that tries to mimic how our brains work by using artificial neurons. It is concerned with algorithms inspired by the structure and function of the brain (neural networks), and it is typically used with much larger amounts of data than traditional ML.
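To give a rough idea of what an artificial neuron is, here is a minimal sketch of a tiny network doing a single forward pass. The weights and biases are made-up numbers chosen for illustration, the sigmoid activation is just one common choice, and the training step (where the network actually learns its weights from data) is left out entirely.

```python
import math


def sigmoid(x):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias):
    """One artificial neuron: a weighted sum of the inputs, then an activation."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias
    return sigmoid(total)


def layer(inputs, weight_rows, biases):
    """A layer is several neurons that all look at the same inputs."""
    return [neuron(inputs, weights, bias) for weights, bias in zip(weight_rows, biases)]


# Hypothetical, hand-picked weights; a real network would learn these from data.
hidden_weights = [[0.5, -0.2], [0.3, 0.8]]
hidden_biases = [0.0, -0.1]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]


def forward(inputs):
    """Pass the inputs through a hidden layer, then through a single output neuron."""
    hidden = layer(inputs, hidden_weights, hidden_biases)
    return layer(hidden, output_weights, output_biases)[0]


print(forward([1.0, 2.0]))
```

"Deep" learning comes from stacking many such layers on top of each other, which is also why it tends to need a lot more data and computing power.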
For fear of making this blog too long and packing in too much new information, it is time to say goodbye. See you in upcoming blog posts.
I hope you enjoyed reading it as much as I enjoyed writing it. Please don't hesitate to contact me with any questions, corrections or suggestions for future posts.