SQLite JSON Processing

SQLiteBeginner
Practice Now

Introduction

In this lab, you will learn how to process JSON data within SQLite. You'll explore how to store, extract, filter, and update JSON data within an SQLite database. This lab provides a hands-on introduction to working with JSON data in SQLite, a skill increasingly valuable in modern data management.

Create a Database and Table

In this step, you will create an SQLite database and a table to store JSON data. SQLite is a lightweight database that stores data in a single file, making it easy to manage.

First, open your terminal. The default path is /home/labex/project.

Now, let's create a directory to store our database.

mkdir sqlite_json
cd sqlite_json

These commands create a directory named sqlite_json and then change the current directory to it. This will keep your project files organized.

Next, create an SQLite database named mydatabase.db.

sqlite3 mydatabase.db

This command opens the SQLite shell, connecting to the mydatabase.db database. If the database file doesn't exist, SQLite will create it.

Now, create a table named products with columns for id, name, and details. The details column will store JSON data as text.

CREATE TABLE products (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT,
    details TEXT
);

This SQL command creates the products table:

  • id: A unique integer that automatically increments for each new product.
  • name: The name of the product (e.g., Laptop, Smartphone).
  • details: A text field to store the JSON data for the product.

Insert JSON Data

In this step, you will insert JSON data into the products table.

Let's insert two sample records into the products table.

INSERT INTO products (name, details) VALUES (
    'Laptop',
    '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}'
);

INSERT INTO products (name, details) VALUES (
    'Smartphone',
    '{"brand": "Samsung", "model": "Galaxy S21", "specs": {"display": "6.2 inch", "camera": "12MP", "storage": "128GB"}}'
);

These INSERT statements add two rows to the products table. The details column contains JSON data as a text string.

To verify the data was inserted correctly, run the following query:

SELECT * FROM products;

Expected Output:

1|Laptop|{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}
2|Smartphone|{"brand": "Samsung", "model": "Galaxy S21", "specs": {"display": "6.2 inch", "camera": "12MP", "storage": "128GB"}}

This output confirms that the JSON data has been successfully stored in the products table.

Extract JSON Fields with a Custom Function

Since SQLite doesn't have built-in JSON functions, you'll create a custom Python function to extract data from the JSON strings.

First, exit the SQLite shell.

.exit

Now, create a Python file named json_extractor.py.

nano json_extractor.py

Paste the following Python code into the json_extractor.py file:

## Import the necessary libraries
import sqlite3
import json

## Define a function to extract a value from a JSON string using a path
def json_extract(json_str, path):
    try:
        ## Parse the JSON string into a Python dictionary
        json_data = json.loads(json_str)
        ## Split the path into components (e.g., 'specs.cpu' becomes ['specs', 'cpu'])
        path_components = path.split('.')
        ## Start with the full JSON object
        value = json_data
        ## Traverse the JSON object using the path components
        for component in path_components:
            ## Get the value for the current component
            value = value.get(component)
        ## Return the final value
        return value
    ## Handle errors if JSON is invalid or path does not exist
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None

## Define a function to connect to the database and register the custom function
def connect_db(db_path):
    ## Connect to the SQLite database at the given path
    conn = sqlite3.connect(db_path)
    ## Register the 'json_extract' Python function as a custom SQL function
    ## "json_extract" is the name in SQL, 2 is the number of arguments, json_extract is the Python function
    conn.create_function("json_extract", 2, json_extract)
    ## Return the database connection
    return conn

## This block runs when the script is executed directly
if __name__ == '__main__':
    ## Connect to the database and register the function
    conn = connect_db('mydatabase.db')
    ## Create a cursor object to execute SQL queries
    cursor = conn.cursor()

    ## Use the custom SQL function to extract the 'brand' from the 'details' column
    cursor.execute("SELECT json_extract(details, 'brand') FROM products WHERE name = 'Laptop'")
    ## Fetch the result and print it
    print(cursor.fetchone()[0])

    ## Use the custom SQL function to extract the 'cpu' from the nested 'specs' object
    cursor.execute("SELECT json_extract(details, 'specs.cpu') FROM products WHERE name = 'Laptop'")
    ## Fetch the result and print it
    print(cursor.fetchone()[0])

    ## Close the database connection
    conn.close()

This Python code defines a function json_extract that takes a JSON string and a path as input and returns the value at that path. It also includes a connect_db function to connect to the SQLite database and register the json_extract function.

  • json.loads(json_str): This line parses the JSON string into a Python dictionary.
  • path.split('.'): This splits the path into a list of components. For example, 'specs.cpu' becomes ['specs', 'cpu'].
  • The loop iterates through the path components, accessing nested values in the JSON data.

Save the file and exit nano.

Now, run the Python script.

python3 json_extractor.py

Expected Output:

Dell
Intel i7

This script connects to the database, registers the json_extract function, and then uses it to extract the brand and CPU of the Laptop.

Filter Data Using JSON Queries

In this step, you'll use the custom json_extract function to filter data based on values within the JSON fields.

Open the json_extractor.py file again.

nano json_extractor.py

Modify the json_extractor.py file to include a function for querying the database:

## Import the necessary libraries
import sqlite3
import json

## Define a function to extract a value from a JSON string using a path
def json_extract(json_str, path):
    try:
        ## Parse the JSON string into a Python dictionary
        json_data = json.loads(json_str)
        ## Split the path into components (e.g., 'specs.cpu' becomes ['specs', 'cpu'])
        path_components = path.split('.')
        ## Start with the full JSON object
        value = json_data
        ## Traverse the JSON object using the path components
        for component in path_components:
            ## Get the value for the current component
            value = value.get(component)
        ## Return the final value
        return value
    ## Handle errors if JSON is invalid or path does not exist
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None

## Define a function to connect to the database and register the custom function
def connect_db(db_path):
    ## Connect to the SQLite database at the given path
    conn = sqlite3.connect(db_path)
    ## Register the 'json_extract' Python function as a custom SQL function
    conn.create_function("json_extract", 2, json_extract)
    ## Return the database connection
    return conn

## Define a function to filter products based on a JSON field
def filter_products(db_path, json_path, value):
    ## Connect to the database
    conn = connect_db(db_path)
    ## Create a cursor object
    cursor = conn.cursor()
    ## Create the SQL query using an f-string to filter by a JSON value
    query = f"SELECT * FROM products WHERE json_extract(details, '{json_path}') = '{value}'"
    ## Execute the query
    cursor.execute(query)
    ## Fetch all matching results
    results = cursor.fetchall()
    ## Close the database connection
    conn.close()
    ## Return the results
    return results

## This block runs when the script is executed directly
if __name__ == '__main__':
    ## Example usage:
    ## Filter for products where the brand is 'Dell'
    dell_products = filter_products('mydatabase.db', 'brand', 'Dell')
    print("Products with brand 'Dell':", dell_products)

    ## Filter for products where the CPU is 'Intel i7'
    intel_products = filter_products('mydatabase.db', 'specs.cpu', 'Intel i7')
    print("Products with CPU 'Intel i7':", intel_products)

This code adds a filter_products function that takes a database path, a JSON path, and a value as input. It then connects to the database, registers the json_extract function, and executes a query to find all products where the value at the specified JSON path matches the given value.

Save the file and exit nano.

Now, run the Python script.

python3 json_extractor.py

Expected Output:

Products with brand 'Dell': [(1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}')]
Products with CPU 'Intel i7': [(1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}')]

This output shows the products that match the specified criteria.

Update JSON Values

In this step, you'll learn how to update values within the JSON fields.

Open the json_extractor.py file.

nano json_extractor.py

Modify the json_extractor.py file to include functions for updating the JSON and the database:

## Import the necessary libraries
import sqlite3
import json

## Define a function to extract a value from a JSON string using a path
def json_extract(json_str, path):
    try:
        ## Parse the JSON string into a Python dictionary
        json_data = json.loads(json_str)
        ## Split the path into components
        path_components = path.split('.')
        ## Start with the full JSON object
        value = json_data
        ## Traverse the JSON object
        for component in path_components:
            value = value.get(component)
        return value
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None

## Define a function to connect to the database and register the custom function
def connect_db(db_path):
    ## Connect to the database
    conn = sqlite3.connect(db_path)
    ## Register the custom SQL function
    conn.create_function("json_extract", 2, json_extract)
    return conn

## Define a function to filter products based on a JSON field
def filter_products(db_path, json_path, value):
    ## Connect to the database
    conn = connect_db(db_path)
    ## Create a cursor object
    cursor = conn.cursor()
    ## Create the SQL query to filter by a JSON value
    query = f"SELECT * FROM products WHERE json_extract(details, '{json_path}') = '{value}'"
    ## Execute the query
    cursor.execute(query)
    ## Fetch all matching results
    results = cursor.fetchall()
    ## Close the connection
    conn.close()
    return results

## Define a function to update a value within a JSON string
def update_json_value(json_str, path, new_value):
    try:
        ## Parse the JSON string into a Python dictionary
        json_data = json.loads(json_str)
        ## Split the path into components
        path_components = path.split('.')
        ## Navigate to the parent of the target value
        target = json_data
        for i in range(len(path_components) - 1):
            target = target.get(path_components[i])

        ## Update the target value
        target[path_components[-1]] = new_value
        ## Convert the Python dictionary back to a JSON string
        return json.dumps(json_data)
    except (json.JSONDecodeError, AttributeError, TypeError):
        ## If an error occurs, return the original JSON string
        return json_str

## Define a function to update a product's details in the database
def update_product_details(db_path, product_name, json_path, new_value):
    ## Connect to the database
    conn = sqlite3.connect(db_path)
    ## Create a cursor object
    cursor = conn.cursor()

    ## Get the current details for the specified product
    ## The '?' is a placeholder to prevent SQL injection
    cursor.execute("SELECT details FROM products WHERE name = ?", (product_name,))
    result = cursor.fetchone()
    ## If the product doesn't exist, close connection and return False
    if not result:
        conn.close()
        return False

    ## Get the current JSON details string
    current_details = result[0]

    ## Update the JSON string with the new value
    updated_details = update_json_value(current_details, json_path, new_value)

    ## Update the database with the new JSON string
    cursor.execute("UPDATE products SET details = ? WHERE name = ?", (updated_details, product_name))
    ## Commit the changes to the database
    conn.commit()
    ## Close the connection
    conn.close()
    return True

## This block runs when the script is executed directly
if __name__ == '__main__':
    ## Example usage: Update the laptop's memory to 32GB
    update_product_details('mydatabase.db', 'Laptop', 'specs.memory', '32GB')
    print("Laptop memory updated to 32GB")

    ## Verify the update by fetching the data again
    conn = sqlite3.connect('mydatabase.db')
    cursor = conn.cursor()
    cursor.execute("SELECT details FROM products WHERE name = 'Laptop'")
    ## Fetch the updated details
    updated_details = cursor.fetchone()[0]
    print("Updated Laptop details:", updated_details)
    ## Close the connection
    conn.close()

This code adds two functions:

  • update_json_value: This function takes a JSON string, a path, and a new value as input. It parses the JSON string, updates the value at the specified path, and returns the updated JSON string.
  • update_product_details: This function takes a database path, a product name, a JSON path, and a new value as input. It connects to the database, retrieves the current JSON data for the product, updates the value at the specified path using update_json_value, and then updates the database with the modified JSON data.

Save the file and exit nano.

Now, run the Python script.

python3 json_extractor.py

Expected Output:

Laptop memory updated to 32GB
Updated Laptop details: {"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "32GB", "storage": "512GB SSD"}}

This output confirms that the Laptop's memory has been updated to 32GB in the database.

Summary

In this lab, you have learned how to process JSON data within SQLite. You started by creating a database and a table to store JSON data. Then, you learned how to insert JSON data into the table. You created a custom Python function to extract specific fields from the JSON data and used this function to filter data based on values within the JSON fields. Finally, you learned how to update values within the JSON fields using a custom Python function. These skills provide a foundation for managing JSON data effectively within SQLite databases.