Jetson Deployment Checklist: From Unboxing to Production

Last updated: February 2026

TL;DR

Getting a Jetson from unboxing to a hardened, production-ready edge AI node takes 4–8 hours of focused work the first time. This checklist covers every phase: initial flash and JetPack setup, NVMe provisioning, power mode configuration, security hardening, inference pipeline deployment, and ongoing monitoring. Work through it in order — each phase builds on the previous one.

Phase 1: Initial Setup

[ ] Flash JetPack via SDK Manager or sdkmanager CLI
Use the current JetPack release from the NVIDIA developer portal. Flash from a Linux host using SDK Manager, or use the pre-flashed SD card image for developer kits. Verify the JetPack version matches your target software stack.

[ ] Complete initial OS setup (first boot)
Set hostname, timezone, and locale during first-boot setup. Use a hostname that identifies the node's physical location (e.g., edgenode-warehouse-01). Avoid generic hostnames like "jetson" that create conflicts when multiple nodes appear on the same network.
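If the first-boot wizard was skipped, or the node is reconfigured later, the same settings can be applied from the shell. The hostname and timezone below are placeholders:

    sudo hostnamectl set-hostname edgenode-warehouse-01
    sudo timedatectl set-timezone America/Chicago      # use the site's local timezone
    sudo localectl set-locale LANG=en_US.UTF-8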

[ ] Update all packages
Run sudo apt update && sudo apt upgrade -y immediately after first boot. JetPack images are rarely current with upstream security patches.

[ ] Verify CUDA, cuDNN, and TensorRT versions
Run dpkg -l | grep -E "cuda|cudnn|tensorrt" to confirm which versions are installed. Document these versions — they determine which model export and conversion paths are supported. Mismatched runtime versions are a common cause of inference failures.

[ ] Test GPU availability
Run python3 -c "import torch; print(torch.cuda.is_available())" (if PyTorch is installed) or run a TensorRT sample to confirm the GPU is accessible and functional.

Phase 2: Storage and Power Configuration

[ ] Install and format NVMe SSD
Insert the NVMe drive, partition it, and format according to your storage layout plan. For the recommended 4-partition layout, see storage layout and ring buffer design. Mount partitions persistently via /etc/fstab with noatime on the video partition.
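A sketch of the fstab entries, assuming the partitions were labeled when formatted; labels, mount points, and filesystem types here are illustrative:

    # /etc/fstab (example entries)
    LABEL=nvme-data    /opt/data     ext4   defaults,nofail            0 2
    LABEL=nvme-video   /mnt/video    ext4   defaults,noatime,nofail    0 2

The nofail option lets the node boot even if the NVMe is absent; run sudo mount -a and check df -h before rebooting.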

[ ] Move Docker root to NVMe
By default, Docker stores layers in /var/lib/docker on the OS drive. On systems with a small eMMC or SD card boot drive, this fills quickly. Update /etc/docker/daemon.json to set "data-root": "/opt/docker" pointing to the NVMe.
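A minimal daemon.json sketch, assuming the NVMe is mounted under /opt and keeping the NVIDIA runtime entry that JetPack ships by default:

    {
      "runtimes": {
        "nvidia": {
          "path": "nvidia-container-runtime",
          "runtimeArgs": []
        }
      },
      "data-root": "/opt/docker"
    }

Stop Docker before switching, move or discard the old /var/lib/docker contents, then sudo systemctl restart docker and confirm the new location with docker info | grep "Docker Root Dir".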

[ ] Configure nvpmodel power mode
Select the power mode matching your thermal design: sudo nvpmodel -m 1 for 7W mode (Orin Nano), -m 0 for max power. Make the setting persistent across reboots by editing /etc/nvpmodel.conf. For thermal context, see fanless enclosure thermal constraints.
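To check the active mode and keep a non-default mode across reboots, something like the following works on current JetPack releases (the DEFAULT line is how recent nvpmodel.conf files express the boot-time mode; verify against your file):

    sudo nvpmodel -q        # print the active power mode
    sudo nvpmodel -m 1      # switch to mode 1 (7W on Orin Nano)
    # In /etc/nvpmodel.conf, set the mode applied at boot:
    # < PM_CONFIG DEFAULT=1 >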

[ ] Set jetson_clocks for sustained inference performance
Run sudo jetson_clocks to lock CPU, GPU, and EMC clocks to maximum for benchmarking. For production, evaluate whether jetson_clocks or the dynamic clock governor better matches your latency vs. power requirements.

[ ] Verify swap configuration
Check swap with free -h. Jetson enables zRAM swap by default. For production inference nodes where memory pressure should be managed by right-sizing hardware rather than swapping, consider disabling swap to prevent unpredictable latency spikes. See RAM sizing for edge inference for guidance on sizing correctly.
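If you choose to disable swap, a sketch assuming the stock JetPack zRAM setup provided by the nvzramconfig service:

    sudo swapoff -a                          # disable swap for the current boot
    sudo systemctl disable nvzramconfig      # stop zRAM swap being recreated at boot
    free -h                                  # Swap should read 0B after the next reboot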

Phase 3: Security Hardening

[ ] Change default OS user password
The default JetPack account uses a well-known password. Change it immediately: passwd. Disable or remove the default account if a dedicated service account is used for deployment automation.

[ ] Configure SSH key authentication; disable password SSH
Add a public key to ~/.ssh/authorized_keys. Then set PasswordAuthentication no in /etc/ssh/sshd_config and restart sshd. This prevents brute-force password attacks on SSH.
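A sketch of the steps; the key file and service account name are placeholders:

    ssh-copy-id -i ~/.ssh/id_ed25519.pub svc-deploy@edgenode-warehouse-01
    # In /etc/ssh/sshd_config:
    #   PasswordAuthentication no
    #   PermitRootLogin no
    sudo systemctl restart ssh

Keep an existing SSH session open while you verify key-based login from a second session, so a configuration mistake cannot lock you out.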

[ ] Configure UFW firewall
Enable UFW with a default-deny policy. Allow only necessary ports: SSH (22), RTSP ingestion if cameras connect to the node on a routed path, and any application-specific management ports. Block all inbound traffic by default.
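An example rule set, assuming a hypothetical management subnet of 10.0.20.0/24 and camera VLAN of 10.0.30.0/24:

    sudo ufw default deny incoming
    sudo ufw default allow outgoing
    sudo ufw allow from 10.0.20.0/24 to any port 22 proto tcp    # SSH from the management VLAN only
    sudo ufw allow from 10.0.30.0/24 to any port 554 proto tcp   # RTSP, only if cameras connect inbound
    sudo ufw enable
    sudo ufw status verbose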

[ ] Disable unnecessary services
Disable services not needed in production: sudo systemctl disable bluetooth, sudo systemctl disable avahi-daemon, sudo systemctl disable cups. Reduce attack surface by running only required services.

[ ] Configure unattended-upgrades for security patches
Install unattended-upgrades and configure it to automatically apply security updates. Review the configuration to exclude kernel updates (which may require JetPack re-flash) while allowing userspace security patches.
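A sketch of the two relevant apt configuration files; the kernel package name below is the usual L4T one, but confirm it against dpkg -l on your image:

    # /etc/apt/apt.conf.d/20auto-upgrades
    APT::Periodic::Update-Package-Lists "1";
    APT::Periodic::Unattended-Upgrade "1";

    # /etc/apt/apt.conf.d/50unattended-upgrades (excerpt)
    Unattended-Upgrade::Package-Blacklist {
        "nvidia-l4t-kernel";
    };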

[ ] Change default camera credentials
Log in to each IP camera's web interface and change the default username/password. Document credentials securely. Cameras with default credentials are one of the most common entry points for network intrusions.

[ ] Verify VLAN isolation
From a device on the camera VLAN, verify that it cannot reach the management VLAN or the internet (unless explicitly permitted). From the compute node's management interface, verify that it can reach the camera VLAN for RTSP but camera devices cannot initiate connections to the management VLAN.
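A quick manual check with hypothetical addresses (10.0.20.1 as the management gateway, 10.0.20.5 as the node's management IP, 10.0.30.11 as a camera):

    # From a host on the camera VLAN: both should time out or be refused
    ping -c 3 -W 2 10.0.20.1
    nc -vz -w 3 10.0.20.5 22

    # From the compute node: the camera's RTSP port should be reachable
    nc -vz -w 3 10.0.30.11 554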

Phase 4: Inference Pipeline Deployment

[ ] Deploy inference application via Docker
Use the NVIDIA container toolkit with --runtime nvidia for GPU access. Pin container image versions explicitly — avoid :latest tags in production. Test container startup time; long startup delays affect recovery time after node reboots.
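A sketch of the run command; the registry, image tag, and mount paths are placeholders, and restart behavior is handled by the systemd unit described later in this phase:

    docker run -d --name inference-pipeline \
      --runtime nvidia \
      -v /mnt/video:/data/video \
      -v /opt/models:/models:ro \
      registry.example.com/inference-pipeline:1.4.2    # pinned tag, never :latest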

[ ] Configure all RTSP stream URLs
Enumerate each camera's RTSP URL in the application configuration file. Use static IP assignments or DNS entries — avoid relying on mDNS or dynamic hostnames. Test each stream with ffprobe before starting the inference pipeline.
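A per-stream check; the credentials and URL are placeholders:

    ffprobe -v error -rtsp_transport tcp -show_streams \
      "rtsp://viewer:CHANGEME@10.0.30.11:554/stream1"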

[ ] Run TensorRT engine build and verify
TensorRT engines are device-specific — build engines on the target Jetson, not on a development machine. Engine build can take 5–30 minutes depending on model size. Verify engine output with a test image before integrating into the full pipeline.
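A typical build-and-verify pass with trtexec, which JetPack installs under /usr/src/tensorrt/bin; the model paths are placeholders:

    /usr/src/tensorrt/bin/trtexec \
      --onnx=/opt/models/detector.onnx \
      --saveEngine=/opt/models/detector.engine \
      --fp16
    # Reload the cached engine and report latency and throughput:
    /usr/src/tensorrt/bin/trtexec --loadEngine=/opt/models/detector.engine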

[ ] Configure ring buffer and log rotation
Set up the ring buffer cleanup process and configure logrotate. Test ring buffer behavior by filling the video partition to capacity during QA. Verify that the cleanup process runs correctly and that no recording gaps result from deletion timing.
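A logrotate sketch; the log directory is a placeholder:

    # /etc/logrotate.d/inference-pipeline
    /var/log/inference-pipeline/*.log {
        daily
        rotate 7
        compress
        delaycompress
        missingok
        notifempty
    }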

[ ] Configure systemd service for auto-start
Create a systemd service unit for the inference application. Set Restart=on-failure and RestartSec=10s for automatic recovery. Enable the service: sudo systemctl enable inference-pipeline.service. Test by rebooting the node and verifying the pipeline starts automatically.
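A sketch of the unit, assuming the container created in the Docker step is named inference-pipeline:

    # /etc/systemd/system/inference-pipeline.service
    [Unit]
    Description=Edge inference pipeline
    Requires=docker.service
    After=docker.service network-online.target

    [Service]
    Restart=on-failure
    RestartSec=10s
    ExecStart=/usr/bin/docker start -a inference-pipeline
    ExecStop=/usr/bin/docker stop -t 30 inference-pipeline

    [Install]
    WantedBy=multi-user.target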

[ ] Validate inference accuracy end-to-end
Run a known test scenario through the pipeline — a person walking in frame, a vehicle of known type, or an object in a known position — and verify that detections are correct and alerts fire as expected. Do not assume the pipeline works correctly in production without end-to-end validation.

Phase 5: Monitoring and Maintenance

[ ] Configure SMART monitoring for NVMe
Install smartmontools and configure smartd to run weekly health checks. Log SMART attributes to a remote monitoring system. Alert when Percentage Used exceeds 80% or Available Spare drops below 20%.
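A sketch: one smartd.conf line scheduling a weekly short self-test (Saturday 03:00) plus a manual spot check of the wear attributes; the device path and alert address are placeholders:

    # /etc/smartd.conf
    /dev/nvme0 -a -d nvme -s (S/../../6/03) -m ops@example.com

    # Manual spot check:
    sudo smartctl -a /dev/nvme0 | grep -E "Percentage Used|Available Spare"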

[ ] Set up thermal monitoring
Log junction temperatures via tegrastats or a custom script reading /sys/devices/virtual/thermal/thermal_zone*/temp. Alert when any thermal zone exceeds 90°C. Sustained throttling indicates an enclosure thermal design issue that must be resolved before leaving the node in production.
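A minimal polling sketch using the sysfs path above; shipping the readings to your monitoring system is left to your tooling:

    # Print each thermal zone's name and temperature in °C
    for zone in /sys/devices/virtual/thermal/thermal_zone*; do
        printf '%s: %d C\n' "$(cat "$zone/type")" "$(( $(cat "$zone/temp") / 1000 ))"
    done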

[ ] Configure NUT for UPS monitoring
If a UPS is installed (recommended — see power and UPS for edge deployments), configure NUT to monitor battery state and trigger graceful shutdown when battery falls below 30% charge or 3 minutes estimated runtime.
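A sketch of the NUT configuration, assuming a USB-connected UPS; the override lines raise the UPS's low-battery thresholds so upsmon's normal forced-shutdown path triggers at roughly 30% charge or 180 seconds of runtime (verify the variable names your UPS actually exposes with upsc):

    # /etc/nut/ups.conf
    [edge-ups]
        driver = usbhid-ups
        port = auto
        override.battery.charge.low = 30
        override.battery.runtime.low = 180

    # /etc/nut/upsmon.conf (excerpt)
    MONITOR edge-ups@localhost 1 upsmon <monitor-password> master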

[ ] Establish pipeline health metrics
Export metrics from the inference pipeline: frames processed per second, inference latency (p50, p95, p99), dropped frames, missed detections on test scenes. A healthy pipeline should show stable FPS and latency over 24-hour periods with no degradation trend.

[ ] Document node configuration
Record: JetPack version, TensorRT version, model name and version, RTSP stream URLs, power mode, partition layout, NVMe model and serial number, UPS model, and network configuration (IPs, VLANs). Store this in a version-controlled configuration repository. Edge nodes deployed without documentation become unmaintainable.

Checklist Summary Table

Phase | Key Tasks | Risk If Skipped
1. Initial Setup | Flash JetPack, update packages, verify GPU | Unpatched CVEs, runtime version mismatch
2. Storage & Power | NVMe layout, Docker root, nvpmodel, swap | OS storage overflow, thermal throttling, latency spikes
3. Security Hardening | SSH keys, firewall, service disable, camera passwords, VLAN verify | Network intrusion, camera takeover, data exfiltration
4. Pipeline Deployment | Docker, RTSP config, TensorRT build, ring buffer, systemd, validation | Pipeline failure on reboot, recording gaps, incorrect inference
5. Monitoring | SMART, thermal, UPS, pipeline metrics, documentation | Silent failures, undetected drive wear, unrecoverable outages

FAQ

How long does TensorRT engine optimization take?

Engine build time depends on model size and optimization level. YOLOv8n INT8: 3–8 minutes on Orin Nano. YOLOv8l INT8: 15–30 minutes on Orin NX. Build once, cache the engine file. Rebuild only when the model changes or JetPack is updated.

Should I use Docker or run the inference application natively?

Docker is strongly recommended for production deployments. It provides dependency isolation, reproducible environments, and simplified updates via image version pinning. The NVIDIA Container Toolkit provides full GPU access from within Docker containers on Jetson.

How do I remotely access a deployed Jetson?

SSH over the management VLAN is standard. For nodes without a VPN or direct network path, a reverse SSH tunnel or a lightweight VPN (WireGuard) allows secure remote access without exposing SSH to the internet. Never expose SSH directly to a public IP.
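A reverse-tunnel sketch with a hypothetical relay host: the Jetson holds an outbound SSH connection open, and you connect back through it:

    # On the Jetson (outbound only; relay.example.com is a placeholder):
    ssh -f -N -R 2222:localhost:22 tunnel@relay.example.com

    # From your workstation, hopping through the relay:
    ssh -J tunnel@relay.example.com -p 2222 svc-deploy@localhost

In production, keep the tunnel alive with a systemd unit or a tool such as autossh rather than a one-off command.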

What is the recommended JetPack upgrade procedure?

JetPack upgrades (e.g., from 6.0 to 6.1) typically require a full re-flash. Document the current configuration, snapshot any custom configuration files, re-flash the new JetPack, then re-apply configuration from your documentation. A configuration-as-code approach (Ansible, custom provisioning scripts) makes this substantially faster.

How do I handle failed inference pipeline containers?

Configure the systemd service or Docker restart policy (--restart=on-failure:5) for automatic restart. Add a watchdog process that monitors pipeline liveness (e.g., checks that frames are being processed at expected rate) and restarts the container if the pipeline stalls without crashing.
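A minimal liveness-check sketch, assuming the pipeline touches a heartbeat file (path hypothetical) each time a frame batch completes; run it from cron or a systemd timer:

    #!/bin/bash
    # Restart the container if the heartbeat file is missing or older than 120 seconds
    HEARTBEAT=/var/run/inference-pipeline/heartbeat
    if [ ! -f "$HEARTBEAT" ] || [ $(( $(date +%s) - $(stat -c %Y "$HEARTBEAT") )) -gt 120 ]; then
        docker restart inference-pipeline
    fi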

Is there a way to A/B test model versions on a deployed node?

One approach: maintain two model directories (model-a and model-b) and an environment variable in the container configuration that selects the active model. Switching models becomes a container restart with the environment variable changed — no re-deployment of the container image required.
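One way this looks in practice, with a hypothetical ACTIVE_MODEL variable that the application reads at startup (image name and paths are also placeholders):

    docker rm -f inference-pipeline
    docker run -d --name inference-pipeline --runtime nvidia \
      -v /opt/models:/models:ro \
      -e ACTIVE_MODEL=/models/model-b \
      registry.example.com/inference-pipeline:1.4.2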