Introduction
This guide will walk you through installing and running DeepSeek LLM on an NVIDIA Jetson Nano Super for local AI automation and research. The Jetson Nano is a powerful yet compact platform for AI workloads, and setting up DeepSeek locally allows you to run inference without relying on cloud-based APIs.
1. Preparing the Jetson Nano
1.1 Install NVIDIA JetPack SDK
Jetson Nano requires JetPack, which includes the OS, CUDA, cuDNN, and other essential libraries.
Steps:
- Download JetPack from NVIDIA’s website
- Flash the OS image to an SD card using Balena Etcher, or with dd on Linux (replace /dev/sdX with your SD card's device):
sudo dd if=JetPack.img of=/dev/sdX bs=4M status=progress
- Insert the SD card into the Jetson Nano and boot the device.
1.2 Set Up the System
Once the system boots up:
sudo apt update && sudo apt upgrade -y
Enable swap space for better performance:
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
To make swap permanent, add this line to /etc/fstab:
/swapfile swap swap defaults 0 0
2. Installing CUDA & cuDNN
DeepSeek LLM needs GPU acceleration to run at a usable speed, which on the Jetson means CUDA and cuDNN.
2.1 Install CUDA Toolkit
On Jetson, CUDA normally comes preinstalled as part of JetPack; if it is missing, install the toolkit:
sudo apt install nvidia-cuda-toolkit
Check the installation:
nvcc --version
2.2 Install cuDNN
- On Jetson, cuDNN is also included with JetPack; for a manual install, download cuDNN from the NVIDIA Developer Zone
- Extract and install:
tar -xvzf cudnn-local-repo-*.tar.gz
cd cudnn*
sudo cp cuda/include/cudnn*.h /usr/local/cuda/include/
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64/
sudo chmod a+r /usr/local/cuda/include/cudnn*.h /usr/local/cuda/lib64/libcudnn*
- Verify installation:
cat /usr/local/cuda/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
3. Setting Up Python & Virtual Environment
3.1 Install Python & Pip
sudo apt install python3 python3-pip python3-venv
3.2 Create a Virtual Environment
python3 -m venv deepseek_env
source deepseek_env/bin/activate
Upgrade Pip & Install Dependencies:
pip install --upgrade pip
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu118
pip install transformers sentencepiece accelerate
Note: accelerate is needed for the device_map="auto" loading used later in this guide. Also, the cu118 wheels on that index are built for x86_64; the Jetson is an aarch64 platform, so for GPU support you will likely need the Jetson-specific PyTorch wheels that NVIDIA publishes for your JetPack version.
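Before moving on, it's worth confirming that PyTorch can actually see the GPU (if the last line prints False, you have a CPU-only build):
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA version the wheel was built against (None on CPU-only builds)
print(torch.cuda.is_available())  # True if the GPU is usable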
4. Installing DeepSeek LLM
4.1 Download the Model
DeepSeek LLM is available on Hugging Face. The 7B weights are roughly 14 GB in fp16, so make sure you have enough free storage before downloading. To download:
pip install huggingface_hub
huggingface-cli login # If required
Then pull the model. The weights of a 7B model are sharded across multiple files, so download the whole repository rather than fetching files one by one (the examples below use the base variant; swap in deepseek-llm-7b-chat for the instruction-tuned model):
huggingface-cli download deepseek-ai/deepseek-llm-7b-base --local-dir deepseek_model
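Alternatively, the same download can be scripted from Python with huggingface_hub's snapshot_download (a minimal sketch; adjust the repo ID if you prefer the chat variant):
from huggingface_hub import snapshot_download

# Fetch all model files (weights, config, tokenizer) into ./deepseek_model
snapshot_download("deepseek-ai/deepseek-llm-7b-base", local_dir="deepseek_model")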
4.2 Load the Model in Python
Create a Python script (run_deepseek.py):
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model in half precision; device_map="auto"
# places the weights on the GPU (requires the accelerate package)
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-llm-7b-base",
    torch_dtype=torch.float16,
    device_map="auto",
)

def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100)
    return tokenizer.decode(output[0], skip_special_tokens=True)

print(generate_response("Hello, how can AI help me?"))
Run the script:
python run_deepseek.py
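Note that a 7B model in fp16 needs roughly 14 GB of memory, so on a board with 8 GB or less the swap configured earlier will be used heavily and generation will be slow; a quantized or smaller model is worth considering. Also, if you downloaded the weights into deepseek_model/ in section 4.1, you can point from_pretrained at that directory instead of the Hub ID to avoid re-downloading:
# Load from the local copy downloaded in section 4.1
tokenizer = AutoTokenizer.from_pretrained("./deepseek_model")
model = AutoModelForCausalLM.from_pretrained("./deepseek_model", torch_dtype=torch.float16, device_map="auto")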
5. Optimizing for Local AI Automation
5.1 Enable TensorRT Optimization (Optional)
TensorRT can speed up inference. On Jetson it ships with JetPack, so it is usually already installed; on other systems:
sudo apt install tensorrt
The torch2trt library converts a PyTorch module given example input tensors. Note that it targets fixed-shape models (e.g., vision networks) and does not handle an LLM's autoregressive generate() loop out of the box, so treat this as a starting point rather than a drop-in optimization:
import torch
from torch2trt import torch2trt

# example_inputs: sample tensor(s) matching the model's forward() signature
model_trt = torch2trt(model, [example_inputs])
5.2 Run DeepSeek as a Local API
You can serve DeepSeek LLM as an API using FastAPI:
pip install fastapi uvicorn
Create api.py:
from fastapi import FastAPI
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

app = FastAPI()

# Load the model once at startup so every request reuses it
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
model = AutoModelForCausalLM.from_pretrained(
    "deepseek-ai/deepseek-llm-7b-base",
    torch_dtype=torch.float16,
    device_map="auto",
)

@app.get("/generate")
def generate(prompt: str):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=100)
    return {"response": tokenizer.decode(output[0], skip_special_tokens=True)}
Run the API:
uvicorn api:app --host 0.0.0.0 --port 8000
Now, you can call the API:
curl "http://localhost:8000/generate?prompt=Tell+me+about+AI"
6. Testing & Integration
6.1 Testing Model Performance
time python run_deepseek.py
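For a finer-grained measurement than wall-clock time, you can time generation directly and compute tokens per second (a minimal sketch reusing the tokenizer, model, and prompt from run_deepseek.py):
import time

prompt = "Hello, how can AI help me?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=100)
elapsed = time.perf_counter() - start

# generate() returns the prompt plus the new tokens, so subtract the prompt length
new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} tokens in {elapsed:.1f}s ({new_tokens / elapsed:.2f} tok/s)")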
Use htop to monitor CPU and memory usage:
htop
Note that nvidia-smi has limited support on Jetson's integrated (Tegra) GPU; the standard monitoring tool is tegrastats (or jtop from the jetson-stats package):
sudo tegrastats
6.2 Automating AI Tasks
- Integrate with n8n for AI-driven workflows.
- Use cron jobs to schedule AI tasks (see the sketch below).
- Connect with Telegram bots for interactive AI responses.
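For example, a small script like the one below can be scheduled with crontab -e to query the local API periodically (a sketch; the prompt and log file name are placeholders):
import requests

# Ask the locally served model a question and append the answer to a log file
resp = requests.get(
    "http://localhost:8000/generate",
    params={"prompt": "Suggest one improvement for my home automation setup."},
    timeout=300,
)
with open("ai_task.log", "a") as f:
    f.write(resp.json()["response"] + "\n")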
Conclusion
This guide walked through setting up DeepSeek LLM on an NVIDIA Jetson Nano Super. You can now use it for automation, research, or your own local AI assistant without cloud dependencies.
Next Steps
- Fine-tune DeepSeek for your use case.
- Optimize model performance with TensorRT and quantization.
- Expand AI automation by integrating with n8n, APIs, and local workflows.