Optimizing Image Search System Architecture with Client‑Side Feature Extraction Using MobileNet
This article walks through the architecture of an image‑search system that extracts feature vectors, stores them in a vector database, and performs similarity queries. It then proposes an optimized design that offloads feature extraction to a lightweight MobileNet model running in the browser, reducing latency, server load, and component complexity.
Preface : An image‑search system extracts feature vectors from images and uses a vector database for insertion, deletion, and similarity retrieval, enabling searches for visually similar images.
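Similarity retrieval ultimately reduces to ranking stored vectors by a distance metric, with cosine similarity a common choice. A minimal JavaScript sketch (the function name is illustrative, not part of any particular vector database's API):

```javascript
// Cosine similarity between two equal-length feature vectors:
// dot(a, b) / (|a| * |b|). Values near 1 mean the embeddings are close,
// i.e. the images are likely visually similar.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

A production vector database computes this (or an approximation of it) over millions of stored vectors using an index rather than a linear scan.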
System Architecture : The typical architecture includes the client uploading images, the server storing them in object storage, inserting structured data into a relational database, sending a message to an MQ, and a search service that consumes the MQ message, downloads the image, extracts its feature vector with a model, and inserts the vector into the vector database.
Image upload process :
Client uploads the image to the server.
Server stores the image in object storage, inserts structured data into a relational database, and sends a message to an MQ.
Server returns a response to the client.
The image‑search service consumes the MQ message, downloads the image, extracts its feature vector using a specific model, and inserts the vector into the vector database.
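The upload steps above can be sketched with in‑memory stand‑ins for the external components (`objectStorage`, `db`, and `mq` here are hypothetical stubs, not real client libraries):

```javascript
// Hypothetical in-memory stand-ins for object storage, the relational DB, and the MQ.
const objectStorage = { files: {}, put(key, bytes) { this.files[key] = bytes; } };
const db = { rows: [], insert(row) { this.rows.push(row); } };
const mq = { messages: [], publish(msg) { this.messages.push(msg); } };

// Upload handler: store the image, record structured data, enqueue a
// feature-extraction task, then respond immediately. The expensive
// extraction happens later, when a consumer picks up the MQ message.
function handleUpload(imageId, imageBytes, metadata) {
  objectStorage.put(imageId, imageBytes);
  db.insert({ id: imageId, ...metadata });
  mq.publish({ type: 'extract-features', imageId });
  return { status: 'accepted', id: imageId };
}
```

The key property is that the response returns before any feature extraction runs; the MQ message is the only link between the upload path and the search service.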
Reasons for using MQ :
Fast asynchronous response – feature extraction is time‑consuming, so handling it asynchronously improves the client experience.
Decoupling of services – the feature‑extraction component is typically written in Python (computer‑vision ecosystem) while backend services may be in Java, Go, Node.js, etc., requiring heterogeneous, loosely‑coupled services.
Peak‑shaving – user uploads exhibit bursty traffic; MQ smooths downstream computation load.
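The peak‑shaving effect can be illustrated with a toy in‑memory queue: producers enqueue instantly during a burst, while the consumer drains the backlog at its own pace (all names here are illustrative):

```javascript
// Producers enqueue cheaply and return fast; a single consumer processes
// one task at a time, so extraction load stays flat regardless of upload spikes.
const queue = [];
function enqueue(task) { queue.push(task); }     // producer side: O(1), non-blocking
function consumeOne(extractFn) {                 // consumer side: one task per call
  const task = queue.shift();
  if (task) extractFn(task);
  return queue.length;                           // remaining backlog
}

// Simulate a burst of 5 uploads followed by steady consumption.
for (let i = 0; i < 5; i++) enqueue({ imageId: `img-${i}` });
const processed = [];
while (consumeOne(t => processed.push(t.imageId)) > 0) {}
```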
Image search process :
Client uploads an image to the server.
Server forwards the image to the image‑search service, which extracts the feature vector, queries the vector database for similar vectors, and returns the search results.
Server retrieves structured data based on the vector‑search results, aggregates the information, and responds to the client.
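The server‑side search path can be sketched as a brute‑force top‑k scan followed by a structured‑data join (`vectorDb` and `relationalDb` are hypothetical in‑memory stand‑ins; a real deployment would use an ANN index in a dedicated vector database):

```javascript
// Cosine similarity between two equal-length vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

const vectorDb = [
  { id: 'img-1', vector: [0.9, 0.1] },
  { id: 'img-2', vector: [0.1, 0.9] },
];
const relationalDb = { 'img-1': { name: 'sunset.jpg' }, 'img-2': { name: 'cat.jpg' } };

// Search: rank stored vectors against the query vector, take top-k,
// then join the matching structured rows before responding.
function search(queryVector, k) {
  return vectorDb
    .map(e => ({ id: e.id, score: cosine(queryVector, e.vector) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map(hit => ({ ...hit, ...relationalDb[hit.id] }));
}
```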
Performance bottlenecks :
Long image transfer chain: client → server → object storage → image‑search service.
Feature computation is computationally intensive and consumes significant server resources.
Optimized architecture using a client‑side model :
To further improve the system, feature extraction can be moved to the client using a lightweight model such as MobileNet.
Client‑side image upload process :
Client requests a direct‑upload URL from the server and uploads the image directly to object storage.
Client performs local computation to extract the feature vector and sends the vector together with structured data to the server.
Server inserts the structured data into a relational database and the vector into a vector database, then responds.
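This direct‑upload flow can be sketched with a stubbed server (`issueUploadUrl` and `ingest` are hypothetical endpoint names; in practice the direct‑upload URL would be a presigned object‑storage URL):

```javascript
// Hypothetical server endpoints, stubbed in-memory for illustration.
const server = {
  storage: {}, rows: [], vectors: [],
  issueUploadUrl(imageId) { return `upload://${imageId}`; }, // stands in for a presigned URL
  ingest(imageId, vector, metadata) {
    this.rows.push({ id: imageId, ...metadata });            // relational DB insert
    this.vectors.push({ id: imageId, vector });              // vector DB insert
    return { status: 'ok' };
  },
};

// Client-side flow: push bytes straight to object storage via the
// direct-upload URL, compute the feature vector locally, then send the
// vector and structured data to the server in a single call.
function clientUpload(imageId, imageBytes, localVector, metadata) {
  const url = server.issueUploadUrl(imageId);
  server.storage[url] = imageBytes;                          // direct upload, bypassing the server
  return server.ingest(imageId, localVector, metadata);
}
```

Note the server never touches the image bytes or runs a model; it only performs two inserts.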
Client‑side image search process :
Client extracts the feature vector locally and sends the vector and structured data to the server.
Server performs vector retrieval and structured‑data lookup, aggregates the results, and responds.
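With extraction moved to the client, the server's search handler shrinks to a vector lookup plus a row join; a minimal sketch (all stores and names are illustrative):

```javascript
// Hypothetical in-memory stores.
const vectors = [{ id: 'a', vector: [1, 0] }, { id: 'b', vector: [0, 1] }];
const rows = { a: { name: 'beach.jpg' }, b: { name: 'forest.jpg' } };

function dot(u, v) { return u.reduce((s, x, i) => s + x * v[i], 0); }

// The server never sees the image: it receives a client-computed vector,
// ranks stored vectors (for unit-length embeddings, dot product is
// equivalent to cosine similarity), and joins the structured row.
function searchByVector(queryVector) {
  const best = vectors.reduce((a, b) =>
    dot(queryVector, a.vector) >= dot(queryVector, b.vector) ? a : b);
  return { id: best.id, ...rows[best.id] };
}
```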
Benefits of the optimized architecture :
Shorter transfer chain – the image travels directly from client to object storage.
Feature computation is offloaded to the client, freeing server resources.
MQ and the dedicated image‑search service are eliminated, simplifying the architecture.
Feasibility and constraints of the client model : Clients have limited, non‑scalable hardware, so the model must be small and low‑cost; MobileNet fits these requirements.
Example: Front‑end MobileNet feature extraction :
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/mobilenet"></script>
</head>
<body>
<input type="file" id="imageInput">
<button onclick="extractFeatures()">Extract Features</button>
<pre id="result"></pre>
<script>
let model;
async function loadModel(){
if(!model){
// Loading MobileNet downloads the model weights from storage.googleapis.com
model = await mobilenet.load({version:2, alpha:1.0});
}
return model;
}
function preprocessImage(image){
const tensor = tf.browser.fromPixels(image)
.resizeNearestNeighbor([224,224])
.toFloat()
.expandDims();
// No manual scaling here: mobilenet's infer() normalizes pixel values
// internally, so the tensor is passed through in the [0, 255] range.
return tensor;
}
async function extractFeatures(){
const input = document.getElementById('imageInput');
if(input.files.length===0){
alert('Please select an image file first.');
return;
}
const model = await loadModel();
const timeStart = Date.now();
const file = input.files[0];
const reader = new FileReader();
reader.onload = async function(e){
const image = new Image();
image.src = e.target.result;
image.onload = async function(){
const processedImage = preprocessImage(image);
const features = model.infer(processedImage, true); // true => return the internal embedding (feature vector) rather than the 1000-class logits
const featuresArray = await features.array();
document.getElementById('result').textContent = JSON.stringify(featuresArray, null, 2);
console.log(`Extract feature spend: ${Date.now() - timeStart} ms`);
};
};
reader.readAsDataURL(file);
}
</script>
</body>
</html>
A simple test on a laptop shows that extracting features for a single image completes in under one second, though the actual time depends on hardware and image size.