How I Cut SQL Query Time from 45 Seconds to 8 Seconds on a 500+ SKU Inventory Database

Daniel Chukwuebuka Nnadi Senior Data Analytics Specialist · Defem Autocare Ltd

Published for Towards Data Science / Hashnode · 2026

When I joined the data team at an automotive parts company, one of the first things I noticed was that the inventory query everyone relied on - the one that pulled stock levels, reorder flags, and supplier status across 500+ SKUs - took 45 seconds to run. Every time. It was so slow that analysts had built workarounds: pulling data in the morning and working from yesterday's snapshot. Real-time decision-making was effectively impossible.

This is the story of how I diagnosed that problem, redesigned the indexing strategy and schema, and brought that query down to 8 seconds - without touching a single line of application code.

45s Before (query time)

8s After (query time)

82% Performance gain

The Problem: What Was Actually Happening

The database was a SQL Server instance with a central inventory table, a suppliers table, a purchase_orders table, and a stock_movements table. The query joining all four was doing a full table scan on every execution - I could see this immediately in the execution plan.

There were a few compounding issues:

No composite indexes - The inventory table had a primary key index on sku_id, but the query was filtering heavily on warehouse_id and category_code together. Neither had an index. Every filter was a full scan.
Implicit type conversions - The join between purchase_orders and inventory was comparing a VARCHAR column to an INT column. SQL Server was silently casting on every row comparison. This alone added about 8 seconds.
Redundant subqueries in SELECT - A correlated subquery was running inside the SELECT clause to calculate last delivery date per SKU. It was executing once per row - 500+ times per query run.
No covering index for the reporting columns - Even with better filtering, the engine was going back to the base table for every column not in the index, causing excessive key lookups.

"The most expensive queries are often not the complex ones - they're the simple ones running against a schema that was never designed for the way the data is actually queried."

Step 1: Read the Execution Plan Properly

Before touching anything, I ran the query with SET STATISTICS IO ON and SET STATISTICS TIME ON and captured the execution plan. This is non-negotiable - if you're tuning SQL without looking at the execution plan, you're guessing.

SQL Server · Diagnostic

-- Step 1: Enable I/O and time statistics
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Step 2: Run the slow query and capture the plan
SELECT *
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE '%inventory%'
ORDER BY qs.total_elapsed_time DESC;

The execution plan showed two table scans with 98% cost share, and the correlated subquery was visible as a "nested loops" operator running 512 times. That was the first thing to eliminate.

Step 2: Eliminate the Correlated Subquery

The original query looked roughly like this - the correlated subquery was the biggest offender:

SQL · Before (Slow)

-- BEFORE: Correlated subquery runs once per SKU row
SELECT
    i.sku_id,
    i.product_name,
    i.qty_on_hand,
    (
        SELECT MAX(po.delivery_date)
        FROM purchase_orders po
        WHERE po.sku_id = i.sku_id     -- runs 500+ times
    ) AS last_delivery_date
FROM inventory i
WHERE i.warehouse_id = 3
  AND i.category_code = 'BRAKE';

SQL · After (Fast)

-- AFTER: Single aggregation join - runs once total
WITH latest_deliveries AS (
    SELECT
        sku_id,
        MAX(delivery_date) AS last_delivery_date
    FROM purchase_orders
    GROUP BY sku_id
)
SELECT
    i.sku_id,
    i.product_name,
    i.qty_on_hand,
    ld.last_delivery_date
FROM inventory i
LEFT JOIN latest_deliveries ld
    ON i.sku_id = ld.sku_id
WHERE i.warehouse_id = 3
  AND i.category_code = 'BRAKE';

This change alone dropped execution time from 45 seconds to around 28 seconds. Progress - but still not good enough.

Step 3: Fix the Type Mismatch

The implicit conversion between VARCHAR and INT on the join condition was invisible in the query itself but devastating in the execution plan. The fix was to align the column types at the schema level and update the foreign key constraint.

Key Lesson

Implicit type conversions prevent SQL Server from using indexes on the converted column. Always verify that joined columns share identical data types - check with INFORMATION_SCHEMA.COLUMNS before writing joins on any table you didn't design yourself.

Step 4: Redesign the Indexes

This was the highest-leverage change. I created a composite index on the two columns most commonly used together in WHERE clauses, and added the reporting columns as INCLUDEd columns to create a covering index - meaning the engine never needs to go back to the base table.

SQL · Index Redesign

-- Drop the old single-column index on warehouse_id
DROP INDEX IF EXISTS IX_inventory_warehouse
ON inventory;

-- Create composite index on filter columns
-- INCLUDE covers the SELECT columns - no key lookups
CREATE NONCLUSTERED INDEX IX_inventory_warehouse_category
ON inventory (warehouse_id, category_code)
INCLUDE (sku_id, product_name, qty_on_hand, reorder_threshold);

-- Index on purchase_orders for the CTE join
CREATE NONCLUSTERED INDEX IX_po_sku_delivery
ON purchase_orders (sku_id)
INCLUDE (delivery_date);

· · ·

The Final Result

Before

Query time 45 seconds
Logical reads approximately 84,000
Subquery executions 512 times
Index seeks 0 (full scans)
Type conversions Per row

After

Query time 8 seconds
Logical reads approximately 4,200
Subquery executions 0 (CTE)
Index seeks Covering index
Type conversions Eliminated

The 8-second runtime is now fast enough for near-real-time dashboard refreshes. The team stopped relying on morning snapshots entirely, and inventory decisions are made on live data.

What I Would Tell My Past Self

Three things I learned from this that I apply to every database I touch now:

Read the execution plan before writing a single fix. You cannot optimize what you haven't measured. The plan tells you exactly where the cost is - don't guess.
Correlated subqueries in SELECT are almost always wrong. If you're selecting a computed value that depends on the current row, you can almost always replace it with a CTE or derived table that runs once.
Design indexes for your queries, not your tables. An index on a single column is often useless if your queries always filter on two columns together. Composite indexes + INCLUDE columns change everything.

"An 82% reduction in query time is not magic - it's just understanding what the database engine is actually doing, and removing the unnecessary work."

The full query pattern and index design are available in my GitHub repository. If you're working on a similar inventory or supply chain database, I'd be happy to discuss your specific schema in the comments.

SQL Database Optimization SQL Server Query Performance Indexing Data Engineering Analytics