Filter Data Using JSON Queries
In this step, you'll use the custom json_extract function to filter data based on values within the JSON fields.
Open the json_extractor.py file again.
nano json_extractor.py
Modify the json_extractor.py file to include a function for querying the database:
## Import the necessary libraries
import sqlite3
import json

## Define a function to extract a value from a JSON string using a path
def json_extract(json_str, path):
    try:
        ## Parse the JSON string into a Python dictionary
        json_data = json.loads(json_str)
        ## Split the path into components (e.g., 'specs.cpu' becomes ['specs', 'cpu'])
        path_components = path.split('.')
        ## Start with the full JSON object
        value = json_data
        ## Traverse the JSON object using the path components
        for component in path_components:
            ## Get the value for the current component
            value = value.get(component)
        ## Return the final value
        return value
    ## Handle errors if JSON is invalid or path does not exist
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None
## Define a function to connect to the database and register the custom function
def connect_db(db_path):
    ## Connect to the SQLite database at the given path
    conn = sqlite3.connect(db_path)
    ## Register the 'json_extract' Python function as a custom SQL function
    conn.create_function("json_extract", 2, json_extract)
    ## Return the database connection
    return conn
## Define a function to filter products based on a JSON field
def filter_products(db_path, json_path, value):
    ## Connect to the database
    conn = connect_db(db_path)
    ## Create a cursor object
    cursor = conn.cursor()
    ## Build the SQL query with ? placeholders so the path and value are bound, not pasted into the SQL text
    query = "SELECT * FROM products WHERE json_extract(details, ?) = ?"
    ## Execute the query, passing the JSON path and the value as parameters
    cursor.execute(query, (json_path, value))
    ## Fetch all matching results
    results = cursor.fetchall()
    ## Close the database connection
    conn.close()
    ## Return the results
    return results
## This block runs when the script is executed directly
if __name__ == '__main__':
    ## Example usage:
    ## Filter for products where the brand is 'Dell'
    dell_products = filter_products('mydatabase.db', 'brand', 'Dell')
    print("Products with brand 'Dell':", dell_products)
    ## Filter for products where the CPU is 'Intel i7'
    intel_products = filter_products('mydatabase.db', 'specs.cpu', 'Intel i7')
    print("Products with CPU 'Intel i7':", intel_products)
This code adds a filter_products function that takes a database path, a JSON path, and a value as input. It then connects to the database, registers the json_extract function, and executes a query to find all products where the value at the specified JSON path matches the given value.
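To try the filtering logic without touching mydatabase.db, the same idea can be sketched against an in-memory database. This is only an illustration under stated assumptions: the table layout (id, name, details) is inferred from the output of this step, the rows are shortened, and the HP row is invented for the demo. The sketch binds the JSON path and the target value as query parameters rather than formatting them into the SQL string.

```python
import json
import sqlite3


def json_extract(json_str, path):
    """Return the value at a dot-separated path, or None if it cannot be resolved."""
    try:
        value = json.loads(json_str)
        for component in path.split('.'):
            value = value.get(component)
        return value
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None


# In-memory database with sample rows (the HP row is hypothetical, added for this demo).
conn = sqlite3.connect(':memory:')
conn.create_function('json_extract', 2, json_extract)
conn.execute('CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, details TEXT)')
conn.executemany(
    'INSERT INTO products (name, details) VALUES (?, ?)',
    [
        ('Laptop', json.dumps({'brand': 'Dell', 'specs': {'cpu': 'Intel i7'}})),
        ('Desktop', json.dumps({'brand': 'HP', 'specs': {'cpu': 'AMD Ryzen 5'}})),
    ],
)

# Bind the JSON path and the target value as parameters.
query = 'SELECT id, name FROM products WHERE json_extract(details, ?) = ?'
dell_rows = conn.execute(query, ('brand', 'Dell')).fetchall()
ryzen_rows = conn.execute(query, ('specs.cpu', 'AMD Ryzen 5')).fetchall()
# A path that does not exist makes json_extract return None (SQL NULL), so nothing matches.
missing_rows = conn.execute(query, ('specs.gpu', 'RTX')).fetchall()
conn.close()

print(dell_rows)
print(ryzen_rows)
print(missing_rows)
```

Run as-is, this should print [(1, 'Laptop')], then [(2, 'Desktop')], then an empty list. The empty result for the missing path shows that rows lacking the requested field are skipped and do not raise an error.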
Save the file and exit nano.
Now, run the Python script.
python3 json_extractor.py
Expected Output:
Products with brand 'Dell': [(1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}')]
Products with CPU 'Intel i7': [(1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}')]
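Both filters return the same row: the Dell XPS 13 laptop is the only product whose brand is 'Dell' and the only one whose specs.cpu is 'Intel i7'. Each result is a tuple of the row's columns, with the JSON details returned as a string.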
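Because each result comes back as a tuple with the JSON details as text, it often helps to parse that text before displaying it. The following is a minimal sketch, assuming the column order (id, name, details) seen in the output above; the summarize helper is a hypothetical name introduced for illustration and is not part of json_extractor.py.

```python
import json

# One row shaped like the tuples in the expected output: (id, name, details JSON text).
rows = [
    (1, 'Laptop', '{"brand": "Dell", "model": "XPS 13", "specs": {"cpu": "Intel i7", "memory": "16GB", "storage": "512GB SSD"}}'),
]


def summarize(row):
    """Turn one (id, name, details) tuple into a short readable line (hypothetical helper)."""
    product_id, name, details_json = row
    # Parse the JSON text stored in the details column.
    details = json.loads(details_json)
    # Fall back to 'unknown' if the nested specs.cpu field is absent.
    cpu = details.get('specs', {}).get('cpu', 'unknown')
    return f"#{product_id} {name}: {details['brand']} {details['model']} ({cpu})"


summaries = [summarize(row) for row in rows]
for line in summaries:
    print(line)
```

For the row above this prints "#1 Laptop: Dell XPS 13 (Intel i7)". In the real script, the list returned by filter_products could be passed through the same kind of helper.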