Sign in to unlock

Get unlimited access to all questions and answers

checked

Get lifetime access

checked

Track progress

checked

Save time

23 Databases interview questions and answers for software engineers

A database is an organized collection of data, generally stored and accessed electronically from a computer system. Where databases are more complex they are often developed using formal design and modeling techniques. Here is a list of coding interview questions on Databases to help you get ready for your next data structures interview in 2021.

  • 1.

    What is Normalization?

    Answer:

    Normalization is the systematic process of reducing data redundancy, eliminating data anomalies, and enhancing data integrity in a database. The goal is to minimize data duplication and ensure that every piece of data is stored in the most efficient and logical manner.

    Why is Normalization Important?

    Normalization is crucial for several reasons:

    1. Data Integrity: By eliminating redundant data, the risk of data becoming inconsistent is greatly reduced.
    2. Efficiency: Normalized databases are more space-efficient and require fewer resources to maintain.
    3. Flexibility: A normalized database is more flexible and adaptable to future changes in business requirements.
    4. Quality Assurance: Data anomalies and inconsistencies can lead to incorrect analytical results or business decisions. Normalization helps in avoiding such issues.

    Normalization Levels

    There are different levels of normalization that a database can achieve, known as normal forms. The most commonly discussed normal forms are:

    1. First Normal Form (1NF): Ensures atomicity, i.e., each data field contains only indivisible values.
    2. Second Normal Form (2NF): Removes partial dependencies, where a non-key attribute is functionally dependent on only part of the primary key.
    3. Third Normal Form (3NF): Eliminates transitive dependencies, where a non-key attribute depends on another non-key attribute.
    4. Boyce-Codd Normal Form (BCNF): A stricter version of 3NF designed to handle a special type of anomaly called a functional dependency.
    5. Fourth Normal Form (4NF): Deals with multi-valued dependencies.
    6. Fifth Normal Form (5NF): Addresses join dependencies, where a database table can be further decomposed if it contains sets of attributes that can be logically grouped.

    Demonstration

    Let's take a simple database table with a few records to demonstrate normalization:

    Unnormalized Form (UNF)

    Order ID Customer Name Product Name Category
    1 John Doe iPhone 12 Electronics
    2 Jane Smith Apple Watch Electronics
    3 Sarah Thompson Nike Shoes Apparel
    4 John Doe Nike T-shirt Apparel

    In this UNF table, both Customer Name and Product Name are duplicated.

    First Normal Form (1NF)

    The table is already in 1NF because it has a primary key, and each field contains atomic values.

    Second Normal Form (2NF)

    To achieve 2NF, we need to remove the partial dependencies. In this case, the Category is partially dependent on the Product Name, which is a part of the primary key. We can break it into a separate table:

    Table 1: Orders

    Order ID Customer Name Product ID
    1 John Doe 1
    2 Jane Smith 2
    3 Sarah Thompson 3
    4 John Doe 4

    Table 2: Products

    Product ID Product Name Category
    1 iPhone 12 Electronics
    2 Apple Watch Electronics
    3 Nike Shoes Apparel
    4 Nike T-shirt Apparel

    Third Normal Form (3NF)

    In 3NF, we identify and remove transitive dependencies. In our example, there is no such dependency.

    Boyce-Codd Normal Form (BCNF)

    This form deals with functional dependencies and is stricter than 3NF. In our example, the table is already in BCNF.

    Final Thoughts

    Normalization is not a one-time process, and databases should be periodically checked for normalization. Moreover, depending on the context, going beyond 3NF may not always be necessary and could even lead to performance issues.

  • 2.

    What are the advantages of NoSQL over traditional RDBMS?

    Answer:

    The decision to use either NoSQL or RDBMS for a particular application depends on a multitude of factors. Here are some of the key advantages that NoSQL databases have over traditional RDBMS systems while also discussing the areas where RDBMS might still be better suited.

    Advantages of NoSQL

    Flexible Schema

    One of the main advantages of NoSQL is its ability to handle unstructured and semi-structured data. Unlike RDBMS, NoSQL databases do not require a predefined schema, offering more flexibility for data storage and retrieval.

    High Performance and Scalability

    NoSQL databases often excel in terms of read/write and horizontal scalability. They can easily handle large volumes of data, making them a good fit for Big Data and high-throughput applications.

    Variety of Data Models

    NoSQL databases come in different flavors, each catering to specific data storage and retrieval needs. The four main types are:

    • Key-Value Stores: Provide a lookup service based on keys.
    • Document Stores: Store, query, and retrieve document-oriented information.
    • Column-Family Stores: Optimize for large datasets with transactional integrity.
    • Graph Databases: Designed to manage and query connections between entities.

    Cost-Effectiveness

    NoSQL databases can be run on commodity hardware, which is often more affordable compared to specialized storage solutions required for RDBMS.

    Quick Iteration and Development

    The lack of a rigid schema and other features designed for flexibility often makes NoSQL databases easier to work with, leading to faster development cycles.

    Example: Document Database with Python

    Here's a quick example using MongoDB, a popular NoSQL document database, with Python:

    from pymongo import MongoClient
    
    # Connect to MongoDB
    client = MongoClient("mongodb://localhost:27017/")
    db = client["mydatabase"]
    
    # Insert a document
    mydoc = {"name": "John", "age": 30}
    db["customers"].insert_one(mydoc)
    
    # Query documents
    for x in db["customers"].find():
      print(x)
    

    Advantages of RDBMS

    While NoSQL solutions offer many benefits, there are areas where RDBMS systems still hold the advantage.

    Data Integrity and ACID Compliance

    RDBMS are known for their strong data integrity through features like ACID (Atomicity, Consistency, Isolation, Durability) properties. This makes RDBMS a natural choice for applications that deal with financial transactions and other sensitive types of data.

    Complex Query Support

    If your application relies heavily on complex queries that require joins and group by operations, RDBMS might be a better fit.

    Mature Ecosystem and Tooling

    RDBMS solutions have a long-established ecosystem, including query optimization tools, support for business intelligence (BI), and analytics, making them a reliable choice for many types of applications.

    Example: Relational Database with Python

    Here's a simple example using SQLite, a lightweight RDBMS, with Python:

    import sqlite3
    
    # Connect to SQLite database
    conn = sqlite3.connect('example.db')
    c = conn.cursor()
    
    # Create a table
    c.execute('''CREATE TABLE stocks
                 (date text, trans text, symbol text, qty real, price real)''')
    
    # Insert a row of data
    c.execute("INSERT INTO stocks VALUES ('2022-01-01','BUY','RHAT',100,35.14)")
    
    # Query the table
    for row in c.execute('SELECT * FROM stocks ORDER BY price'):
        print(row)
    
    # Save the changes
    conn.commit()
    
    # Close the connection
    conn.close()
    

    Conclusion

    While NoSQL databases offer flexibility and scalability, they might not be suitable for use cases that require ACID compliance, strong schema, and complex relational queries. The choice between NoSQL and RDBMS ultimately depends on your application's specific requirements.

  • 3.

    What is the difference between Data Definition Language (DDL) and Data Manipulation Language (DML)?

    Answer:

    In the world of databases, two SQL command categories are particularly important to grasp: Data Definition Language (DDL) and Data Manipulation Language (DML).

    Core Concepts

    Data Definition Language (DDL)

    • Objective: Defines the structure of database schemas and objects.
    • Key Commands: CREATE, ALTER, DROP, TRUNCATE, RENAME.

    Data Manipulation Language (DML)

    • Objective: Manipulates the data within the database objects.
    • Key Commands: SELECT, INSERT, UPDATE, DELETE, MERGE.

    In-Depth Explanation

    Data Definition Language (DDL)

    DDL is used to define, modify, and delete structures in the database. These structures include tables, views, indexes, and more. DDL statements generally auto-commit, meaning the changes are effective immediately and cannot be rolled back.

    The commonly used DDL commands are:

    • CREATE: Creates new database objects like tables, views, and indexes.

      CREATE TABLE Customers (ID int, Name varchar(255));
      
    • ALTER: Modifies existing database objects, like tables or columns.

      ALTER TABLE Customers ADD Age int;
      
    • DROP: Removes database objects.

      DROP TABLE Customers;
      
    • TRUNCATE: Removes all rows from a table, but keeps the table structure.

      TRUNCATE TABLE Customers;
      
    • RENAME: Renames a table or column.

      ALTER TABLE Customers RENAME TO Clients;
      

    Data Manipulation Language (DML)

    DML is used to retrieve, manipulate, and delete data in existing database objects. DML commands generally do not auto-commit, meaning you can use transaction management to roll back the changes.

    The commonly used DML commands are:

    • SELECT: Retrieves data from one or more tables.

      SELECT * FROM Customers;
      
    • INSERT: Adds new rows of data into a table.

      INSERT INTO Customers (ID, Name) VALUES (1, 'Alice');
      
    • UPDATE: Modifies existing data in a table.

      UPDATE Customers SET Name = 'Robert' WHERE ID = 1;
      
    • DELETE: Removes rows from a table.

      DELETE FROM Customers WHERE ID = 1;
      
    • MERGE: Performs an upsert operation (combines INSERT and UPDATE).

      MERGE INTO Customers AS TARGET USING NewCustomers AS SOURCE
      ON TARGET.ID = SOURCE.ID
      WHEN MATCHED THEN UPDATE SET TARGET.Name = SOURCE.Name
      WHEN NOT MATCHED THEN INSERT (ID, Name) VALUES (SOURCE.ID, SOURCE.Name);
      

    Real-World Examples

    To better understand the practical applications of DDL and DML, consider the following scenarios:

    1. Setting Up a New Table (DDL):

      • DDL Command: CREATE TABLE Employees (ID int, Name varchar(255));
      • Objective: Defines a new table Employees with columns ID and Name.
    2. Adding a New Employee Record (DML):

      • DML Command: INSERT INTO Employees (ID, Name) VALUES (1, 'John');
      • Objective: Inserts a new row with ID=1 and Name='John' into the Employees table.
    3. Changing the Employee Name (DML):

      • DML Command: UPDATE Employees SET Name = 'Jenny' WHERE ID = 1;
      • Objective: Updates the Name column to 'Jenny' where ID=1 in the Employees table.
    4. Dropping the Employees Table (DDL):

      • DDL Command: DROP TABLE Employees;
      • Objective: Deletes the entire Employees table and its schema.
One tip that got me hired by Google, Microsoft, and Stripe

I've worked for Microsoft, Google, Stripe, and received offers from many other companies. One thing I learned when I was interviewing myself is that standard interview tips are woefully inadequate.

Reverse Tech Interview: Questions to Stump an Interviewer

Few people want to get into an uncomfortable environment or an unprofitable company with no prospects. If you’re wondering how to get a real feel for a company during an interview, you’re welcome. I’ll give a list of questions that interviewers don’t usually expect...

How to become a programmer, move to the US, and get a dream job

Are you curious about what you need to do to earn more than $15.000 a month, drive a Tesla, live in sunny California, and work at Google, Apple, Facebook, or Amazon?

11 Reactive Systems interview questions and answers for software engineers

Reactive systems are an architectural style that allows multiple individual applications to blend into one unit, reacting to their environment, while staying aware of each other. Here is a list of coding interview questions on Reactive Systems to help you get ready for your next data struc...

. Reactive Systems
34 Microservices interview questions and answers for software engineers

Microservice architecture – a variant of the service-oriented architecture structural style – arranges an application as a collection of loosely coupled services. In a microservices architecture, services are fine-grained and the protocols are lightweight. Here is a list of coding intervie...

. Microservices
7 Layering & Middleware interview questions for developers

Middleware in the context of distributed applications is software that provides services beyond those provided by the operating system to enable the various components of a distributed system to communicate and manage data. Middleware supports and simplifies complex distributed application...

. Layering & Middleware
55 Docker interview questions and answers for software engineers

Docker is a set of platform as a service products that use OS-level virtualization to deliver software in packages called containers. Here is a list of coding interview questions on Docker to help you get ready for your next data structures interview in 2021.

. Docker
23 Databases interview questions and answers for software engineers

A database is an organized collection of data, generally stored and accessed electronically from a computer system. Where databases are more complex they are often developed using formal design and modeling techniques. Here is a list of coding interview questions on Databases to help you g...

. Databases
21 Concurrency interview questions and answers for software engineers

In computer science, concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the final outcome. Here is a list of coding interview questions on Concurrency to help you get ready for your n...

. Concurrency
13 CAP Theorem interview questions and answers for software engineers

In theoretical computer science, the CAP theorem, also named Brewer's theorem after computer scientist Eric Brewer, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees: Consistency: Every read receives the ...

. CAP Theorem
Load more posts

Features

folder icon

Access all answers

Get the inside track on what to expect in your next tech interview. We offer you a collection of high quality technical interview questions to help you prepare for your next coding interview.

graph icon

Track progress

A smart interface helps you track your progress and focus on the topics you need. You can search through questions or sort them by difficulty or type.

clock icon

Save time

Save tens of hours searching for information on hundreds of low-quality sites designed to drive traffic and make money from advertising.

Land a six-figure job at one of the top companies.

amazon logo facebook logo google logo microsoft logo uber logo
Prepare for a technical interview

Get hired with our
tech interview questions & answers