-
A database is a shared collection of logically related data, together with a description of that data, designed to meet the information needs of an organization.
-
Data Storage: A database is used to store large amounts of structured data, making it easily accessible, searchable, and retrievable.
-
Data Analysis: A database can be used to perform complex data analysis, generate reports, and provide insights into the data.
-
Record Keeping: A database is often used to keep track of important records, such as financial transactions, customer information, and inventory levels.
-
Web Applications: Databases are an essential component of many web applications, providing dynamic content and user management.
- Integrity
- Availability
- Security
- Independent of Application
- Concurrency
-
Relational Databases
- Also known as SQL databases, these databases use a relational model to organize data into tables with rows and columns.
-
NoSQL Databases
- These databases are designed to handle large amounts of unstructured or semi-structured data, such as documents, images, or videos. (MongoDB)
-
Column Databases
- These databases store data in columns rather than rows, making them well-suited for data warehousing and analytical applications. (Amazon Redshift, Google BigQuery)
-
Graph Databases
- These databases are used to store and query graph-structured data, such as social network connections or systems. (Neo4j, Amazon Neptune)
-
Key-value databases
- These databases store data as a collection of keys and values, making them well-suited for caching and simple data storage needs. (Redis, Amazon DynamoDB)
-
Also known as SQL databases, these databases use a relational model to organize data into tables with rows and columns.
-
table/relation
-
rows/records/tuples
-
columns/fields/attributes
-
cardinality of the relation/number of rows
-
degree of the relation/number of attributes
-
null
-
domain
- A database management system (DBMS) is a software system that provides the interfaces and tools needed to store, organize, and manage data in a database.
- A DBMS acts as an intermediary between the database and the applications or users that access the data stored in the database.
- users -> application -> dbms -> database -> os -> hardware
- Data Management - Store, retrieve and modify data
- Integrity - Maintain accuracy of data
- Concurrency - Simultaneous data access for multiple users
- Transaction - Modifications to the database must either succeed completely or not happen at all
- Security - Access to authorized users only
- Utilities - Data import/export, user management, backup, logging
- A key in a database is an attribute or a set of attributes that uniquely identifies a tuple (row) in a table. Keys play a crucial role in ensuring the integrity and reliability of a database by enforcing unique constraints on the data and establishing relationships between tables.
-
Super Key
- A Super key is a combination of columns that uniquely identifies any row within a relational database management system (RDBMS) table
-
Candidate key
- A candidate key is a minimal Super key, meaning it has no redundant attributes.
- In other words, no attribute can be removed from it without losing the ability to uniquely identify a tuple (row) in the table.
-
Primary Key
- A primary key is a unique identifier for each tuple in a table. There can only be one primary key in a table, and it cannot contain null values.
-
Alternate Key
- An alternate key is a candidate key that is not used as the primary key.
-
Composite Key
- A composite key is a key (often the primary key) that is made up of two or more attributes.
- Composite keys are used when a single attribute is not sufficient to uniquely identify a tuple in a table.
-
Surrogate Key
- A surrogate key is an artificial identifier with no business meaning, such as an auto-incremented integer, used as the primary key in place of a natural key.
-
Foreign Key
- A foreign key is a set of attributes in a table that refers to the primary key (or another candidate key) of another table. The foreign key links these two tables.
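The key types above can be tried out in a small sketch. It uses Python's built-in `sqlite3` module as a stand-in for MySQL (an assumption made only so the example runs without a server), and the helper name `try_insert` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute(
    "CREATE TABLE department (department_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,   -- primary key (a surrogate key here)
        email TEXT UNIQUE,        -- alternate key: unique, but not the primary key
        department_id INTEGER REFERENCES department (department_id)  -- foreign key
    )
""")
conn.execute("INSERT INTO department VALUES (1, 'it')")
conn.execute("INSERT INTO employee VALUES (1, 'jdoe@example.com', 1)")


def try_insert(sql):
    """Return True if the statement succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same primary key value as an existing row.
duplicate_pk = try_insert("INSERT INTO employee VALUES (1, 'other@example.com', 1)")
# Same alternate key value as an existing row.
duplicate_email = try_insert("INSERT INTO employee VALUES (2, 'jdoe@example.com', 1)")
# Foreign key pointing at a department that does not exist.
dangling_fk = try_insert("INSERT INTO employee VALUES (3, 'new@example.com', 99)")
# A row that respects every key.
valid_row = try_insert("INSERT INTO employee VALUES (4, 'ok@example.com', 1)")

print(duplicate_pk, duplicate_email, dangling_fk, valid_row)
```

Only the last insert is accepted; the other three are rejected by the primary key, the unique (alternate) key, and the foreign key respectively.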
-
Cardinality in database relationships describes how many instances of one entity can be associated with a single instance of a related entity.
-
1:1, 1:N, M:N
Examples:
- Person -> Driving License Number
- Student -> college branch
- Restaurants -> orders
- Restaurants -> menu
- Students -> courses
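An M:N relationship such as Students -> courses is usually stored with a third "junction" table. The following is a minimal sketch, assuming Python's built-in `sqlite3` as a stand-in database; the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Junction table: one row per (student, course) pair, with a composite primary key.
    CREATE TABLE enrollment (
        student_id INTEGER REFERENCES student (student_id),
        course_id  INTEGER REFERENCES course (course_id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO course  VALUES (10, 'Databases'), (20, 'Networks');
    INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student can take many courses ...
courses_of_asha = [row[0] for row in conn.execute("""
    SELECT c.title FROM enrollment e
    JOIN course c ON c.course_id = e.course_id
    WHERE e.student_id = 1
    ORDER BY c.title
""")]

# ... and one course can have many students.
students_in_databases = [row[0] for row in conn.execute("""
    SELECT s.name FROM enrollment e
    JOIN student s ON s.student_id = e.student_id
    WHERE e.course_id = 10
    ORDER BY s.name
""")]

print(courses_of_asha, students_in_databases)
```

A 1:N relationship needs no junction table: a foreign key column on the "many" side is enough.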
- Complexity: Setting up and maintaining a database can be complex and time-consuming, especially for large and complex systems.
- Cost: The cost of setting up and maintaining a database, including hardware, software, and personnel, can be high.
- Scalability: As the amount of data stored in a database grows, it can become more difficult to manage, leading to performance and scalability issues.
- Data Integrity: Ensuring the accuracy and consistency of data stored in a database can be a challenge, especially when multiple users are updating the data simultaneously.
- Security: Securing a database from unauthorized access and protecting sensitive information can be difficult, especially with the increasing threat of cyber attacks.
- Data Migration: Moving data from one database to another or upgrading to a new database can be a complex and time-consuming process.
- Flexibility: The structure of a database is often rigid and inflexible, making it difficult to adapt to changing requirements or to accommodate new types of data.
- Data integrity in databases refers to the accuracy, completeness, and consistency of the data stored in a database.
- It is a measure of the reliability and trustworthiness of the data and ensures that the data in a database is protected from errors, corruption, or unauthorized changes.
- There are various methods used to ensure data integrity, including:
-
Constraints
- Constraints in databases are rules or conditions that must be met for data to be inserted, updated, or deleted in a database table.
- They are used to enforce the integrity of the data stored in a database and to prevent data from becoming inconsistent or corrupted.
-
Transactions
- A sequence of database operations that are treated as a single unit of work.
-
Normalization
- A design technique that minimizes data redundancy and ensures data consistency by organizing data into separate tables.
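As a rough illustration of how constraints protect data integrity, the sketch below declares NOT NULL, DEFAULT, and CHECK rules and shows the database refusing rows that break them. It assumes Python's built-in `sqlite3` as a stand-in for MySQL, and the helper name `rejected` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,                    -- a value is required
        hire_date TEXT DEFAULT '2002-11-12',   -- used when no value is supplied
        salary REAL CHECK (salary >= 0)        -- must satisfy the condition
    )
""")


def rejected(sql):
    """Return True if the database refuses the statement because of a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


null_name_rejected = rejected("INSERT INTO employee (name, salary) VALUES (NULL, 100)")
negative_salary_rejected = rejected("INSERT INTO employee (name, salary) VALUES ('Jane', -1)")
valid_row_rejected = rejected("INSERT INTO employee (name, salary) VALUES ('John', 50000)")

# The valid row did not specify hire_date, so the DEFAULT value was stored.
default_hire_date = conn.execute(
    "SELECT hire_date FROM employee WHERE name = 'John'"
).fetchone()[0]

print(null_name_rejected, negative_salary_rejected, valid_row_rejected, default_hire_date)
```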
- Constraints in databases are rules or conditions that must be met for data to be inserted, updated, or deleted in a database table.
- They are used to enforce the integrity of the data stored in a database and to prevent data from becoming inconsistent or corrupted.
- NOT NULL
- UNIQUE
- PRIMARY KEY
- AUTO_INCREMENT
- CHECK
- DEFAULT
- FOREIGN KEY
Referential Actions:
- RESTRICT
- CASCADE
- SET NULL
- SET DEFAULT
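The referential actions listed above decide what happens to child rows when the referenced parent row is updated or deleted. Here is a small sketch of three of them, assuming Python's built-in `sqlite3` as a stand-in for MySQL; the `badge` table is hypothetical and exists only to show SET NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (department_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        department_id INTEGER REFERENCES department (department_id)
            ON DELETE RESTRICT ON UPDATE CASCADE
    );
    CREATE TABLE badge (
        id INTEGER PRIMARY KEY,
        employee_id INTEGER REFERENCES employee (id) ON DELETE SET NULL
    );
    INSERT INTO department VALUES (1, 'it');
    INSERT INTO employee VALUES (1, 1);
    INSERT INTO badge VALUES (100, 1);
""")

# ON UPDATE CASCADE: changing the parent key rewrites the child's foreign key.
conn.execute("UPDATE department SET department_id = 7 WHERE department_id = 1")
employee_department = conn.execute(
    "SELECT department_id FROM employee WHERE id = 1"
).fetchone()[0]

# ON DELETE RESTRICT: a parent row that still has children cannot be deleted.
try:
    conn.execute("DELETE FROM department WHERE department_id = 7")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

# ON DELETE SET NULL: deleting the parent clears the child's foreign key.
conn.execute("DELETE FROM employee WHERE id = 1")
badge_employee = conn.execute(
    "SELECT employee_id FROM badge WHERE id = 100"
).fetchone()[0]

print(employee_department, delete_blocked, badge_employee)
```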
- SQL (Structured Query Language) is a programming language used for managing and manipulating data in relational databases.
- It allows you to insert, update, retrieve, and delete data in a database.
- It is widely used for data management in many applications, websites, and businesses. In simple terms, SQL is used to communicate with and control databases.
- DDL | Data Definition Language
- CREATE
- ALTER
- DROP
- TRUNCATE
- RENAME
- DML | Data Manipulation Language
- INSERT
- UPDATE
- DELETE
- DQL | Data Query Language
- SELECT
- DCL | Data Control Language
- GRANT
- REVOKE
- TCL | Transaction Control Language
- COMMIT
- ROLLBACK
- SAVEPOINT
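The TCL commands are easiest to understand on a transfer between two accounts, where either both updates must be kept or neither. This is a minimal sketch, assuming Python's built-in `sqlite3` in place of MySQL; the `account` table and the `transfer` function are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the BEGIN / COMMIT / ROLLBACK statements below are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100), (2, 0)")


def transfer(amount):
    """Move `amount` from account 1 to account 2 as one unit of work."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = 2", (amount,))
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = 1", (amount,))
        conn.execute("COMMIT")    # both updates become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")  # undo the credit that was already applied
        return False


first = transfer(30)    # enough funds: committed
second = transfer(500)  # debit would make the balance negative: rolled back
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()

print(first, second, balances)
```

The second transfer leaves no trace: the credit to account 2 is undone when the debit from account 1 fails.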
- CREATE
- DROP
-- 1. CREATE DATABASE
CREATE DATABASE IF NOT EXISTS db;
-- 2. DROP DATABASE
DROP DATABASE IF EXISTS db;
- CREATE
- TRUNCATE
- DROP
- ALTER
- RENAME
-- 1. CREATE TABLE
CREATE TABLE department(
department_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(50) NOT NULL
);
CREATE TABLE employee(
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
email VARCHAR(50) UNIQUE,
hire_date DATE DEFAULT '2002-11-12',
salary DECIMAL(10,2) CHECK (salary >= 0),
department_id INT,
CONSTRAINT name_salary_unique UNIQUE(name, salary),
FOREIGN KEY (department_id) REFERENCES department (department_id)
ON DELETE RESTRICT
ON UPDATE CASCADE
);
INSERT INTO department (department_id, name) VALUES
(1, 'it'),
(2, 'hr');
INSERT INTO employee (id, name, email, hire_date, salary, department_id)
VALUES
(1, 'John Doe', 'jdoe@example.com', '2022-01-01', 50000, 1),
(2, 'Jane Smith', 'jsmith@example.com', '2022-02-15', 60000, 2),
(3, 'Bob Johnson', 'bjohnson@example.com', '2022-03-01', 55000, 1),
(4, 'Sarah Williams', 'swilliams@example.com', '2022-04-01', 65000, 2),
(5, 'Mike Davis', 'mdavis@example.com', '2022-05-15', 70000, 1);
-- 2. TRUNCATE TABLE (removes all rows but keeps the table definition)
TRUNCATE TABLE employee;
-- 3. DROP TABLE (removes the table itself)
DROP TABLE employee;
- The "ALTER TABLE" statement in SQL is used to modify the structure of an existing table.
- Some of the things that can be done using the ALTER TABLE statement include:
- Add columns
- Delete columns
- Modify columns
- Add Constraints
- Delete Constraints
-- 4. ALTER TABLE
-- Add two new columns to an existing table
-- (MySQL positions a column with FIRST or AFTER; there is no BEFORE keyword):
ALTER TABLE table_name
ADD column_name data_type [FIRST | AFTER other_column_name],
ADD column_name data_type [FIRST | AFTER other_column_name];
-- Drop an existing column from a table:
ALTER TABLE table_name
DROP column_name;
-- Rename an existing column in a table:
ALTER TABLE table_name
CHANGE old_column_name new_column_name data_type;
-- Modify the data type of a column in a table:
ALTER TABLE table_name
MODIFY COLUMN column_name new_data_type;
-- Add Constraint
ALTER TABLE table_name
ADD CONSTRAINT name_of_constraint CHECK (column_name > 18);
-- Delete Constraint
ALTER TABLE table_name
DROP CONSTRAINT name_of_constraint;
-- Add a primary key to a table:
ALTER TABLE table_name
ADD PRIMARY KEY (column_name);
-- Add a foreign key to a table:
ALTER TABLE table_name
ADD FOREIGN KEY (column_name)
REFERENCES referenced_table_name(referenced_column_name);
-- Drop a primary key or foreign key constraint from a table:
ALTER TABLE table_name
DROP PRIMARY KEY;
ALTER TABLE table_name
DROP FOREIGN KEY foreign_key_name;
-- 5. RENAME TABLE
RENAME TABLE old_name TO new_name;
- The three main DML (Data Manipulation Language) commands in MySQL are:
- INSERT
- UPDATE
- DELETE
CREATE TABLE department(
department_id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(50) NOT NULL
);
CREATE TABLE employee(
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(50) NOT NULL,
email VARCHAR(50) UNIQUE,
hire_date DATE DEFAULT '2002-11-12',
salary DECIMAL(10,2) CHECK (salary >= 0),
department_id INT,
CONSTRAINT name_salary_unique UNIQUE(name, salary),
FOREIGN KEY (department_id) REFERENCES department (department_id)
ON DELETE RESTRICT
ON UPDATE CASCADE
);
INSERT INTO department (department_id, name) VALUES
(1, 'it'),
(2, 'hr');
INSERT INTO employee (id, name, email, hire_date, salary, department_id)
VALUES
(1, 'John Doe', 'jdoe@example.com', '2022-01-01', 50000, 1),
(2, 'Jane Smith', 'jsmith@example.com', '2022-02-15', 60000, 2),
(3, 'Bob Johnson', 'bjohnson@example.com', '2022-03-01', 55000, 1),
(4, 'Sarah Williams', 'swilliams@example.com', '2022-04-01', 65000, 2),
(5, 'Mike Davis', 'mdavis@example.com', '2022-05-15', 70000, 1);
UPDATE TB
SET c1 = 'JAVA', c2 = 'second'
WHERE c1 = 'C' OR c1 = 'C++';
DELETE FROM TB
WHERE c1 = 'C' OR c1 = 'C++';
SELECT ABS(c1) FROM TB;
SELECT COUNT(c1) FROM TB;
- ABS()
- ROUND()
- CEIL()
- FLOOR()
- MAX()
- MIN()
- SUM()
- AVG()
- COUNT()
- STD()
- VARIANCE()
SELECT * FROM TB LIMIT 10;
SELECT c1,c2 FROM TB;
SELECT DISTINCT c1 FROM TB;
SELECT DISTINCT c1,c2 FROM TB;
SELECT c1 FROM TB
WHERE c2 IN ('APPLE','NOKIA');
SELECT c1 FROM TB
WHERE c2 NOT IN ('APPLE','NOKIA');
SELECT c1 FROM TB
WHERE c2 BETWEEN 10 AND 20;
SELECT c1,SQRT(c2),rating/10 AS star FROM TB;
SELECT c1, 'value' AS new_tmp_col FROM TB;
SELECT model,screen_size FROM db.phones
WHERE brand_name = 'samsung'
ORDER BY screen_size DESC
LIMIT 3 ;
SELECT model, num_front_cameras + num_rear_cameras as 'Total' FROM db.phones
ORDER BY Total DESC
LIMIT 3;
SELECT model, battery_capacity FROM db.phones
ORDER BY battery_capacity DESC
LIMIT 1,1;
SELECT model,rating FROM db.phones
WHERE brand_name = 'apple'
ORDER BY rating ASC
LIMIT 3;
SELECT brand_name, price FROM db.phones
ORDER BY brand_name ASC, price DESC;
SELECT brand_name, COUNT(*) AS 'no_phones',
ROUND(AVG(price)) AS 'avg price',
MAX(rating) AS 'max rating',
ROUND(AVG(screen_size), 2) AS 'screen size',
ROUND(AVG(battery_capacity)) AS 'avg battery capacity'
FROM db.phones
GROUP BY brand_name
ORDER BY no_phones DESC;
SELECT has_5g,
ROUND(AVG(price)) AS 'avg price',
ROUND(AVG(rating)) AS 'avg rating'
FROM db.phones
GROUP BY has_5g;
SELECT brand_name,
processor_brand,
COUNT(*) AS 'no_phones',
ROUND(AVG(primary_camera_rear)) AS 'res'
FROM db.phones
WHERE brand_name = 'samsung'
GROUP BY brand_name, processor_brand;
SELECT brand_name,
ROUND(AVG(price)) AS 'avg_price'
FROM db.phones
GROUP BY brand_name
ORDER BY avg_price DESC LIMIT 3;
SELECT brand_name, COUNT(*) AS 'no_phones'
FROM db.phones
WHERE has_nfc = 'True' AND has_ir_blaster = 'True'
GROUP BY brand_name
ORDER BY no_phones DESC LIMIT 3;
SELECT has_5g,
ROUND(AVG(price)) AS 'avg_price'
FROM db.phones
WHERE brand_name = 'samsung'
GROUP BY has_5g;
SELECT brand_name,
COUNT(*) AS 'count',
ROUND(AVG(rating)) AS 'avg_rating'
FROM db.phones
GROUP BY brand_name
HAVING count > 80
ORDER BY avg_rating DESC;
SELECT brand_name,
ROUND(AVG(ram_capacity)) AS 'avg_ram'
FROM db.phones
WHERE refresh_rate > 90 AND fast_charging_available = 1
GROUP BY brand_name
HAVING COUNT(*) > 10
ORDER BY avg_ram DESC LIMIT 3;
SELECT brand_name,
COUNT(*) AS 'num_phones',
AVG(price) AS 'avg_price',
AVG(rating) AS 'avg_rating'
FROM db.phones
WHERE has_5g = 'True'
GROUP BY brand_name
HAVING avg_rating > 70 AND num_phones > 10
ORDER BY avg_price DESC;
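The grouping queries above combine WHERE (filters rows before grouping) with HAVING (filters groups after aggregation). The sketch below runs one such query on a tiny, made-up `phones` table, assuming Python's built-in `sqlite3` as a stand-in for MySQL; the rows and thresholds are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phones (brand_name TEXT, model TEXT, price INTEGER, rating INTEGER)"
)
conn.executemany("INSERT INTO phones VALUES (?, ?, ?, ?)", [
    ("samsung", "s1", 300, 80),
    ("samsung", "s2", 500, 90),
    ("apple", "a1", 900, 85),
    ("apple", "a2", 1100, 95),
    ("nokia", "n1", 100, 60),
])

rows = conn.execute("""
    SELECT brand_name, COUNT(*) AS num_phones, AVG(price) AS avg_price
    FROM phones
    WHERE rating >= 70        -- row filter: drops the nokia row before grouping
    GROUP BY brand_name
    HAVING COUNT(*) >= 2      -- group filter: keeps brands with at least two phones left
    ORDER BY avg_price DESC
""").fetchall()

for brand, num_phones, avg_price in rows:
    print(brand, num_phones, avg_price)
```

A condition on an aggregate such as `COUNT(*)` or `AVG(price)` belongs in HAVING, because WHERE is evaluated before any group exists.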
SELECT * FROM T1
CROSS JOIN T2;
SELECT * FROM T1
INNER JOIN T2
ON T1.id = T2.id;
SELECT * FROM T1
LEFT JOIN T2
ON T1.id = T2.id;
SELECT * FROM T1
RIGHT JOIN T2
ON T1.id = T2.id;
SELECT * FROM T1
LEFT JOIN T2
ON T1.id = T2.id
UNION
SELECT * FROM T1
RIGHT JOIN T2
ON T1.id = T2.id;
-- Self join: the same table is joined to itself under two aliases
SELECT * FROM T1 AS A
INNER JOIN T1 AS B
ON A.ID = B.EID;
SELECT * FROM T1
UNION
SELECT * FROM T2;
- UNION
- UNION ALL
- INTERSECT
- EXCEPT
SELECT * FROM T1
INNER JOIN T2
ON T1.id = T2.id AND T1.eyr = T2.cyr;
SELECT * FROM T1
INNER JOIN T2
ON T1.id = T2.id
INNER JOIN T3
ON T2.uid = T3.uid;
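To see how the join types differ, it helps to run them on two small tables that only partly overlap. The following sketch assumes Python's built-in `sqlite3` as a stand-in for MySQL, with hypothetical sample rows. Because older SQLite versions lack RIGHT JOIN, the full outer join is emulated with two LEFT JOINs, which is equivalent to the LEFT JOIN ... UNION ... RIGHT JOIN pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (id INTEGER, name TEXT);
    CREATE TABLE T2 (id INTEGER, city TEXT);
    INSERT INTO T1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO T2 VALUES (2, 'x'), (3, 'y'), (4, 'z');
""")


def run(sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


# CROSS JOIN: every row of T1 paired with every row of T2 (3 x 3 rows).
cross_count = len(run("SELECT * FROM T1 CROSS JOIN T2"))

# INNER JOIN: only ids present in both tables.
inner = run("SELECT T1.id, T2.id FROM T1 INNER JOIN T2 ON T1.id = T2.id ORDER BY T1.id")

# LEFT JOIN: every row of T1; missing matches show up as NULL (None in Python).
left = run("SELECT T1.id, T2.id FROM T1 LEFT JOIN T2 ON T1.id = T2.id ORDER BY T1.id")

# Full outer join emulation: union of "all of T1" and "all of T2".
full = run("""
    SELECT T1.id, T2.id FROM T1 LEFT JOIN T2 ON T1.id = T2.id
    UNION
    SELECT T1.id, T2.id FROM T2 LEFT JOIN T1 ON T1.id = T2.id
""")

print(cross_count, inner, left, sorted(full, key=str))
```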
- Based on Result
- Scalar
- Row
- Table
- Based on Working
- Independent
- Correlated
- INSERT
- UPDATE
- DELETE
- SELECT
- WHERE
- SELECT
- FROM
- HAVING
SELECT * FROM T1
WHERE C1 = (SELECT MAX(C1) FROM T1);
SELECT * FROM T1
WHERE C1 IN (SELECT C1 FROM T1
WHERE C2 > (SELECT AVG(C2) FROM T1));
SELECT * FROM T1
WHERE (C1,C2) IN (SELECT C1,MAX(C2)
FROM T1
GROUP BY C1);
SELECT * FROM T1 M1
WHERE C1 > (SELECT AVG(C1) FROM T1 M2 WHERE M2.C2 = M1.C2);
SELECT C1,C2,C3,
(SELECT AVG(C3) FROM T1 M2 WHERE M2.C2 = M1.C2) AS avg_c3
FROM T1 M1;
SELECT T2.C2, AVG(T1.avg_c3) FROM
(SELECT C1, AVG(C3) AS avg_c3 FROM T3 GROUP BY C1) T1
JOIN T2
ON T1.C1 = T2.C1
GROUP BY T2.C2;
SELECT C1, AVG(C2) FROM T1
GROUP BY C1
HAVING AVG(C2) > (SELECT AVG(C2) FROM T1);
INSERT INTO NT
(C1, C2)
SELECT C1, C2 FROM T1 WHERE C2 > (SELECT AVG(C2) FROM T1);
UPDATE NT
SET C3 = (
SELECT SUM(C2) FROM T1
WHERE T1.C1 = NT.C1
);
DELETE FROM T1
WHERE C1 IN (
{query|sq|ssq}
)
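The difference between an independent and a correlated subquery can be seen by comparing each row first with the overall average and then with the average of its own group. This is a small sketch, assuming Python's built-in `sqlite3` as a stand-in for MySQL. The table reuses the names T1, C1, C2, C3, but the column roles (C1 as a label, C2 as a group, C3 as a number) and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (C1 TEXT, C2 TEXT, C3 INTEGER)")
conn.executemany("INSERT INTO T1 VALUES (?, ?, ?)", [
    ("p", "g1", 10),
    ("q", "g1", 30),
    ("r", "g2", 50),
    ("s", "g2", 70),
])

# Independent scalar subquery: the inner query runs once and yields one value
# (the overall average of C3, which is 40 here).
above_overall = conn.execute("""
    SELECT C1 FROM T1
    WHERE C3 > (SELECT AVG(C3) FROM T1)
    ORDER BY C1
""").fetchall()

# Correlated subquery: the inner query refers to the outer row (M1.C2),
# so it is evaluated per outer row, against that row's own group.
above_group = conn.execute("""
    SELECT C1 FROM T1 M1
    WHERE C3 > (SELECT AVG(C3) FROM T1 M2 WHERE M2.C2 = M1.C2)
    ORDER BY C1
""").fetchall()

print(above_overall, above_group)
```

Row `q` is below the overall average but above the average of its own group, so it appears only in the correlated result.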