How to Run Keyless Agent
This page explains how to set up and run IDV Bridge On-Premise (Keyless Agent) and describes the required configuration.
Infrastructure Requirements
Keyless Agent requires the following minimum versions and infrastructure to run:
Kubernetes 1.32+
Helm (if you install using the Helm chart)
In terms of performance, a single instance of the Keyless Agent (2000 mCPUs, 8000 Mi RAM) typically performs as follows:
5 requests in parallel
3 seconds for each image processed
Ready for horizontal scaling (online mode only)
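As a rough sizing aid, the figures above imply about 5 / 3 ≈ 1.67 images per second for one instance. The sketch below turns that into an instance count for a target load. It assumes throughput scales linearly with the number of instances, which is an assumption rather than a measured result, and the function name `replicas_needed` is hypothetical.

```python
import math

# Figures quoted above for one instance (2000 mCPUs, 8000 Mi RAM).
PARALLEL_REQUESTS = 5      # requests handled in parallel
SECONDS_PER_IMAGE = 3.0    # processing time for each image


def replicas_needed(target_images_per_second: float) -> int:
    """Estimate the instance count, ASSUMING linear horizontal scaling."""
    if target_images_per_second <= 0:
        raise ValueError("target_images_per_second must be positive")
    # One instance handles PARALLEL_REQUESTS / SECONDS_PER_IMAGE images/s,
    # so divide the target by that rate and round up.
    return math.ceil(target_images_per_second * SECONDS_PER_IMAGE / PARALLEL_REQUESTS)


if __name__ == "__main__":
    print(replicas_needed(10))
```

Treat the result as a starting point only and confirm it by measuring on your own hardware.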
Pull the Docker image
The Docker image is available in the Keyless Quay repository. First, log in with docker login:
docker login -u="keyless_technologies+<PROVIDED_TENANT_NAME>" -p="<PROVIDED_PASSWORD>" quay.io
Then, proceed with pulling the container:
docker pull quay.io/keyless_technologies/keyless-agent:v3.0.0
Running Keyless Agent
You can run the agent yourself as a Docker container or use our Helm chart.
Once the Docker image is set up to run the service, configure it with the following environment variables. A variable with no default is required.
NUM_OF_CIRC - Number of circuits to create during enrollment (default: 25)
To enable online enrollment:
The following environment variables are required for online enrollment. If they are not set, online enrollment is disabled:
AUTH_SERVER_URL - URL of the Keyless auth server (same as the host passed to the SDK)
AUTH_SERVER_API_KEY - API key for the Keyless auth server (same as the API key passed to the SDK)
To configure the logs:
• LOG_FORMAT - Log format, json or human (default: human)
To configure the HTTP server:
PORT - Port for the HTTP server to listen on (default: 80)
To configure concurrency:
• MAX_CONCURRENCY - Maximum number of concurrent biometric sessions (default: number of CPUs)
• MAX_WAIT - Maximum number of requests waiting to be assigned a biometric session (default: 1)
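The variables and defaults listed above can be checked before starting the container. The following is a minimal sketch of such a pre-flight check. It is not the agent's own parsing code: the helper name `resolve_agent_config` and the returned `online_enrollment` key are hypothetical, and the sketch assumes online enrollment needs both auth variables to be set.

```python
import os
from typing import Mapping, Optional


def resolve_agent_config(env: Mapping[str, str], cpu_count: Optional[int] = None) -> dict:
    """Apply the documented defaults to a mapping of environment variables."""
    cpus = cpu_count or os.cpu_count() or 1
    log_format = env.get("LOG_FORMAT", "human")
    if log_format not in ("json", "human"):
        raise ValueError("LOG_FORMAT must be 'json' or 'human'")
    url = env.get("AUTH_SERVER_URL")
    api_key = env.get("AUTH_SERVER_API_KEY")
    return {
        "NUM_OF_CIRC": int(env.get("NUM_OF_CIRC", "25")),
        "LOG_FORMAT": log_format,
        "PORT": int(env.get("PORT", "80")),
        # Documented default: the number of CPUs.
        "MAX_CONCURRENCY": int(env.get("MAX_CONCURRENCY", str(cpus))),
        "MAX_WAIT": int(env.get("MAX_WAIT", "1")),
        # ASSUMPTION: both auth variables must be set for online enrollment.
        "online_enrollment": bool(url and api_key),
    }


if __name__ == "__main__":
    print(resolve_agent_config(os.environ))
```

Passing `os.environ` shows the effective values for the current shell, which can help catch a missing auth variable before deployment.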
Maximum concurrency overview
Each enrollment request needs to run a biometric session to extract embeddings. The service runs at most MAX_CONCURRENCY biometric sessions at the same time. When all sessions are busy, further requests are queued and processed as sessions become available. Once MAX_WAIT requests are already waiting, additional requests are rejected with HTTP 429 Too Many Requests.
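The behaviour described above can be illustrated with a small model. This is a sketch, not the agent's implementation: it assumes a burst of requests arriving at the same instant with no sessions already in progress, and the function name `admit` is hypothetical.

```python
def admit(incoming: int, max_concurrency: int, max_wait: int) -> dict:
    """Split a burst of simultaneous requests into running, queued and rejected."""
    if incoming < 0 or max_concurrency < 0 or max_wait < 0:
        raise ValueError("all arguments must be non-negative")
    running = min(incoming, max_concurrency)    # get a biometric session right away
    queued = min(incoming - running, max_wait)  # wait for a session to free up
    rejected = incoming - running - queued      # answered with HTTP 429
    return {"running": running, "queued": queued, "rejected": rejected}


if __name__ == "__main__":
    # 10 simultaneous requests with MAX_CONCURRENCY=4 and MAX_WAIT=1
    print(admit(10, max_concurrency=4, max_wait=1))
```

With MAX_CONCURRENCY=4 and MAX_WAIT=1, a burst of 10 requests leaves 4 running, 1 queued and 5 rejected in this model.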
Memory management
The memory used by the service is mostly controlled by MAX_CONCURRENCY, as it denotes how many biometric sessions can run at the same time. Each session consumes a certain amount of memory because it needs to load biometric models and run the biometric extraction. The more concurrent sessions, the more memory is consumed. Other, less significant factors are:
the number of circuits created during enrollment
the number of requests waiting to be assigned a biometric session
the resolution of photos used for enrollment
Below is the approximate memory consumption for offline enrollment with 25 circuits created during enrollment:
MAX_CONCURRENCY=2 - consumes approximately 0.5 GB of memory
MAX_CONCURRENCY=4 - consumes approximately 1 GB of memory
MAX_CONCURRENCY=7 - consumes approximately 1.5 GB of memory
MAX_CONCURRENCY=9 - consumes approximately 2 GB of memory
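For values between the listed points, a rough estimate can be obtained by interpolating. The sketch below assumes memory grows linearly between neighbouring data points, which the figures above suggest but do not guarantee. The name `estimate_memory_gb` is hypothetical, and the estimate applies only to offline enrollment with 25 circuits.

```python
# Approximate figures from the list above: (MAX_CONCURRENCY, GB of memory).
MEMORY_POINTS = [(2, 0.5), (4, 1.0), (7, 1.5), (9, 2.0)]


def estimate_memory_gb(max_concurrency: int) -> float:
    """Linearly interpolate between the documented data points (ASSUMPTION)."""
    lowest, highest = MEMORY_POINTS[0][0], MEMORY_POINTS[-1][0]
    if not lowest <= max_concurrency <= highest:
        # Outside the documented range there is nothing to interpolate from.
        raise ValueError(f"max_concurrency must be between {lowest} and {highest}")
    for (c0, m0), (c1, m1) in zip(MEMORY_POINTS, MEMORY_POINTS[1:]):
        if c0 <= max_concurrency <= c1:
            return m0 + (m1 - m0) * (max_concurrency - c0) / (c1 - c0)
    raise AssertionError("unreachable: the range check above covers all segments")


if __name__ == "__main__":
    print(estimate_memory_gb(3))
```

The function refuses to extrapolate outside the range 2 to 9, since no figures are given there.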
Performance and throughput
At peak performance, the service takes approximately 1 s to generate an offline enrollment with 25 circuits. Biometric extraction takes approximately 0.5 s (depending on CPU power); the rest is spent transferring data over the network and processing the request.
Concurrent sessions | CPU cores | Throughput
1 | 1 | 0.6
1 | 2 | 1.7
1 | 4 | 2.2
1 | 8 | 2.2
2 | 2 | 0.7
2 | 4 | 2.3
2 | 8 | 4.5
MAX_WAIT controls how many requests can wait to be assigned a biometric session, and therefore how the service handles spikes in requests. If more requests than MAX_WAIT are waiting, the additional requests are rejected with HTTP 429 Too Many Requests.
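Because rejected requests receive HTTP 429, callers may want to retry with a delay. The following is a minimal client-side sketch using exponential backoff. The `send` callable is hypothetical: it stands for whatever code performs one request against your deployment and returns the HTTP status code. The retry policy shown is an assumption, not a documented recommendation.

```python
import time
from typing import Callable


def call_with_backoff(send: Callable[[], int],
                      max_attempts: int = 5,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it returns a status other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status = 429
    for attempt in range(max_attempts):
        status = send()
        if status != 429:
            return status
        if attempt < max_attempts - 1:
            # Double the delay after each rejection: 0.5 s, 1 s, 2 s, ...
            sleep(base_delay * 2 ** attempt)
    return status  # still 429 after the final attempt
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real delays.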
The throughput of the service is mostly determined by the number of concurrent biometric sessions and the number of CPU cores available. The benchmarks above are for the stated numbers of concurrent sessions and CPU cores, and are provided purely to demonstrate how maximum concurrency and CPU cores affect performance. We advise always measuring on your own hardware for the most accurate performance estimates.