Project Background
- Warehouse personnel must manually count small products, which vary in appearance and quantity, as part of the inbound receiving requirements before the products are received into the warehouse.
- The current manual counting process is time-consuming, labour-dependent, and has a high possibility of human error.
Key Issues
Time Consuming
Reducing the time spent on the current manual counting process would allow the operations team to focus on other value-added tasks.
Error Prone
The high possibility of human error leads to inventory inaccuracies that go unnoticed during the inbound receiving stage.
Project Objective
The project aims to reduce the time spent on the counting process and increase its accuracy rate by leveraging technology.
Design Process
Solution: Machine Vision Application
Count Calcula (TSI Counting) provides a machine vision-powered solution aimed at enhancing the efficiency of inbound item management. Through our web service and light-box, we automate the process of item counting, thereby boosting accuracy and streamlining operations in warehouse inbound processes. Our solution seamlessly integrates machine vision efficiency into existing counting stations, ensuring smooth workflow continuity without any disruption.
Technology Breakdown
Artificial Intelligence - Machine Vision
Item counting is an Object Detection problem that can be solved with the use of Convolutional Neural Networks.
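Framed this way, the item count is simply the number of detections the model keeps. The following is a minimal Python sketch of that idea, not the project's code: the `Detection` type, the `count_items` function and the 0.5 default threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One predicted bounding box (hypothetical structure for illustration)."""
    x1: float
    y1: float
    x2: float
    y2: float
    confidence: float


def count_items(detections: list[Detection], min_confidence: float = 0.5) -> int:
    """Count the detections whose confidence meets the threshold."""
    return sum(1 for d in detections if d.confidence >= min_confidence)


detections = [
    Detection(10, 10, 50, 50, 0.92),
    Detection(60, 12, 98, 52, 0.81),
    Detection(5, 70, 20, 90, 0.18),  # low-confidence box, likely background
]
print(count_items(detections))  # 2
```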
Criteria for evaluating possible neural networks are:
Accuracy
Speed
Deep Neural Network Architecture Choice:
Ultralytics YOLO v8
YOLO v8 is fast
Its speed comes from its single-pass neural network. Bounding-box prediction and classification are done in one forward pass, rather than in separate region-proposal and classification stages, which reduces computational overhead and speeds up the process.
Cutting edge
YOLO is a well-supported project. It has regular releases that improve the network's performance and accuracy. YOLO v8 replaces anchor boxes with anchor-free detection. The more flexible bounding box predictions improve the accuracy of the model.
Light Box
We need to have a controlled environment. Repeatable and consistent lighting and perspective help increase the reliability and accuracy of model results.
Reduce reflections
The items are wrapped in clear plastic. The direct glare of the ceiling lights can cause reflections that obstruct the view of the items. The top panel uses frosted acrylic to reduce the reflections caused by ceiling lights.
Fixed perspective
To improve the consistency of the model’s performance, the scale and perspective of the items need to be fixed. This reduces the ambiguity caused by different perspectives.
Small footprint
The footprint of the light box is about 41 cm by 31 cm. This makes it easy to deploy at existing counting stations without changing the warehouse floor plan or workflow.
Building Dataset
Using the light box, our client helped us take images for the dataset. We took 200 images of each SKU.
Split
The split was applied with random sampling. For each SKU, 120 images were set aside for the training and validation datasets and 80 images were set aside for testing. The test images were not used in the training pipeline.
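A random 120/80 split of this kind can be sketched in a few lines of Python. The filenames, the fixed seed and the `split_dataset` helper are assumptions made for illustration; the source does not say how the sampling was implemented.

```python
import random


def split_dataset(filenames, n_test=80, seed=42):
    """Randomly split one SKU's images into train/validation and test sets."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(filenames)
    rng.shuffle(shuffled)
    test = shuffled[:n_test]
    train_val = shuffled[n_test:]
    return train_val, test


images = [f"sku_a_{i:03d}.jpg" for i in range(200)]  # hypothetical filenames
train_val, test = split_dataset(images)
print(len(train_val), len(test))  # 120 80
```

Keeping the test filenames in a separate list makes it straightforward to check that none of them leak into training.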
Augmentation
To improve the generalisation of the model, we use augmentation to increase the size and variation of the training dataset. Augmentations applied: horizontal and vertical flips, 180° rotation, blur, brightness adjustment and colour shift.
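Geometric augmentations must move the bounding-box labels together with the pixels. The Python sketch below shows how a label would change under the flips and the 180° rotation, assuming boxes are stored as normalised (x_center, y_center, width, height) values in the range 0 to 1, as in the YOLO label format. The function names are hypothetical, and augmentation libraries normally do this bookkeeping automatically.

```python
def flip_horizontal(box):
    """Mirror a normalised (x_center, y_center, width, height) box left-right."""
    x, y, w, h = box
    return (1.0 - x, y, w, h)


def flip_vertical(box):
    """Mirror a normalised box top-bottom."""
    x, y, w, h = box
    return (x, 1.0 - y, w, h)


def rotate_180(box):
    """A 180° rotation is a horizontal flip followed by a vertical flip."""
    return flip_vertical(flip_horizontal(box))


box = (0.25, 0.75, 0.125, 0.25)
print(rotate_180(box))  # (0.75, 0.25, 0.125, 0.25)
```

Blur, brightness and colour shift change only pixel values, so the labels stay as they are for those augmentations.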
Background Class
To help the model ignore background items, we added a null (background) class made up of new images and images selected from the Daily Items around the World dataset.
Training Model
One Model per SKU. This helped us achieve:
Scalability
We anticipate that many SKUs will be in use. To keep adding new SKUs without increasing the complexity of any single model, we decided that each SKU would have its own model to identify it.
Reliability
If a single model had to cover a growing number of SKUs, it would become overly complex. Each new SKU would also change the weights of the model, which could change the reliability of the results for existing SKUs. Hence, to maintain reliability, each SKU is trained as its own model.
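One way to picture the one-model-per-SKU design is a lookup from SKU to weights file, where adding a SKU only adds an entry. This Python sketch is an assumption about how such a lookup could be organised: the `SkuModelRegistry` class, the `weights` directory and the `.pt` file naming are hypothetical.

```python
from pathlib import Path


class SkuModelRegistry:
    """Maps each SKU to its own weights file (hypothetical layout)."""

    def __init__(self, weights_dir):
        self.weights_dir = Path(weights_dir)
        self._weights = {}

    def register(self, sku):
        """Add a SKU without touching any previously registered model."""
        self._weights[sku] = self.weights_dir / f"{sku}.pt"

    def weights_for(self, sku):
        """Return the weights path for a SKU, or fail clearly if untrained."""
        try:
            return self._weights[sku]
        except KeyError:
            raise KeyError(f"No model trained for SKU {sku!r}") from None


registry = SkuModelRegistry("weights")
registry.register("SKU-001")
before = registry.weights_for("SKU-001")
registry.register("SKU-002")  # a new SKU is a new entry, nothing is retrained
print(registry.weights_for("SKU-001") == before)  # True
```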
Hyperparameter tuning
We tuned the Intersection over Union (IoU) and Confidence hyperparameters to maximise the accuracy of the results on the test dataset. We used the F1 score against Confidence curve to identify the Confidence values that yield the highest accuracy.
Software Architecture
This is the software that allows users to access and use the model. As a web app, it is easy to deploy and does not require specialist equipment or software. Users can visit and log in to the web app on any Android device and immediately start operations.
The primary features of the web app are:
- Allow users to take photos to feed into the AI model.
- Create an audit trail of the inbound invoices and the AI results, so that users can review each invoice record.
Architecture Diagram
We used Docker Compose to orchestrate the different components. The backend micro-services are connected within a single Docker Network. The frontend web app is within a separate Docker Network. Client traffic is encrypted through Reverse Proxy servers.
Capturing Records
In order to track the AI model’s results, each record is identified by a unique pair of SKU and Invoice number.
Barcode Scanner
The barcode scanner is used for data entry of the SKU, Invoice and Label Quantity, which speeds up the process. We support the following barcode symbologies: Code 39, Code 93 and Code 128.
AI Scanner
Each image is sent to the AI API, which responds with the number of items detected and an annotated image. The user is also prompted to use either the top or bottom shelf, and to remove the items from the plastic.
Image Storage
The images are uploaded to an AWS S3 bucket. The Image SQL table maps each image to a record in the Records SQL table.
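The unique SKU and Invoice pair and the image-to-record mapping can be expressed as a uniqueness constraint and a foreign key. The sketch below uses Python's built-in SQLite purely for illustration; the source does not name the database engine, and the table and column names here are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE records (
    id INTEGER PRIMARY KEY,
    sku TEXT NOT NULL,
    invoice TEXT NOT NULL,
    label_quantity INTEGER,
    count_quantity INTEGER,
    UNIQUE (sku, invoice)            -- one record per SKU and Invoice pair
);
CREATE TABLE images (
    id INTEGER PRIMARY KEY,
    record_id INTEGER NOT NULL REFERENCES records(id),
    s3_key TEXT NOT NULL             -- object key in the S3 bucket (hypothetical)
);
""")

cur = conn.execute(
    "INSERT INTO records (sku, invoice, label_quantity, count_quantity) VALUES (?, ?, ?, ?)",
    ("SKU-001", "INV-1001", 50, 50),
)
conn.execute(
    "INSERT INTO images (record_id, s3_key) VALUES (?, ?)",
    (cur.lastrowid, "INV-1001/SKU-001/img_01.jpg"),
)

# A second record with the same SKU and Invoice pair is rejected.
try:
    conn.execute("INSERT INTO records (sku, invoice) VALUES (?, ?)", ("SKU-001", "INV-1001"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True
```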
Download records as CSV
All records that match the search query are downloaded as a single CSV file. The file holds the SKU, Invoice, Label Quantity, Count Quantity, User ID and list of image filenames for each record.
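A CSV export with these columns could look like the Python sketch below. The record dictionary keys, the sample values and the choice of ";" to join image filenames within one cell are assumptions; the source does not specify how the filename list is encoded.

```python
import csv
import io

COLUMNS = ["SKU", "Invoice", "Label Quantity", "Count Quantity", "User ID", "Image Filenames"]


def records_to_csv(records):
    """Serialise the matching records into one CSV document."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    for r in records:
        writer.writerow([
            r["sku"], r["invoice"], r["label_quantity"], r["count_quantity"],
            r["user_id"], ";".join(r["images"]),  # separator is an assumption
        ])
    return buffer.getvalue()


records = [{
    "sku": "SKU-001", "invoice": "INV-1001", "label_quantity": 50,
    "count_quantity": 49, "user_id": "u42", "images": ["img_01.jpg", "img_02.jpg"],
}]
csv_text = records_to_csv(records)
print(csv_text)
```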
Download images in Zip
All images from the records that match the search query are downloaded as a zip file. Each record has a folder with its images.
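The folder-per-record layout can be sketched with Python's standard `zipfile` module. The `images_to_zip` helper, the in-memory image bytes and the `<invoice>_<sku>` folder naming are assumptions made for illustration.

```python
import io
import zipfile


def images_to_zip(records):
    """Pack images into a zip archive with one folder per record.

    `records` maps a (sku, invoice) pair to {filename: image bytes}.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for (sku, invoice), images in records.items():
            for filename, data in images.items():
                # Folder name is hypothetical: "<invoice>_<sku>/<filename>"
                archive.writestr(f"{invoice}_{sku}/{filename}", data)
    return buffer.getvalue()


records = {
    ("SKU-001", "INV-1001"): {"img_01.jpg": b"fake-bytes-1", "img_02.jpg": b"fake-bytes-2"},
    ("SKU-002", "INV-1001"): {"img_01.jpg": b"fake-bytes-3"},
}
payload = images_to_zip(records)
with zipfile.ZipFile(io.BytesIO(payload)) as archive:
    names = archive.namelist()
print(names)
```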
Archive records
To reduce the clutter of old records, they can be archived based on an archive date. This date can only be set by a supervisor. Only the supervisor has access to the archived records.
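The archiving rules can be sketched as a small permission check. This Python example assumes that records created before the archive date count as archived and that roles are plain strings; the `RecordStore` class and its fields are hypothetical.

```python
from datetime import date


class RecordStore:
    """Holds records and hides archived ones from non-supervisors."""

    def __init__(self, records):
        self.records = list(records)
        self.archive_before = None  # nothing is archived until a date is set

    def set_archive_date(self, role, cutoff):
        """Only a supervisor may set the archive date."""
        if role != "supervisor":
            raise PermissionError("only a supervisor can set the archive date")
        self.archive_before = cutoff

    def visible(self, role):
        """Supervisors see everything; other users see unarchived records only."""
        if role == "supervisor" or self.archive_before is None:
            return list(self.records)
        return [r for r in self.records if r["created"] >= self.archive_before]


records = [
    {"invoice": "INV-0900", "created": date(2023, 1, 15)},
    {"invoice": "INV-1001", "created": date(2024, 3, 2)},
]
store = RecordStore(records)
store.set_archive_date("supervisor", date(2024, 1, 1))
print(len(store.visible("operator")), len(store.visible("supervisor")))  # 1 2
```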
Security
Encryption
We deploy reverse proxy servers that expose the internal HTTP traffic as encrypted HTTPS traffic on the external ports. SSL/TLS certificate issuance and renewal are handled by the Certbot CLI tool, with certificates issued by Let’s Encrypt. The certificate renewal and redeployment of the reverse proxy servers are set up as a crontab job.
Identity and Access
The backend API uses JWTs and refresh tokens to validate the identity of incoming requests. The Spring Boot backend also tracks the access rights of each user. Different user access rights limit access to APIs and web app features.