Using LLM + Home Assistant to Detect Delivery Shippers at the Gate

Problem: Object-detection systems such as Frigate can detect a "person" but cannot infer occupational context: they cannot tell a delivery person from a guest or a passer-by.

Solution: Combine a Large Language Model (LLM) with Home Assistant to perform deeper image analysis, detect visual signs of delivery workers (shippers), and automatically count them.

Workflow

1. Camera Detection
   ↓
2. Person Detected?
   ├─ Yes → AI Analysis
   │          ↓
   │       3. Gemini Flash
   │          ↓
   │       4. Count Shipper
   │          ↓
   │       5. > 0 Shipper?
   │          ├─ Yes → Update Counter
   │          │          ↓
   │          │       6. Auto Reset (5min)
   │          └─ No → End (counter unchanged)
   └─ No → End

Code Implementation

1. Trigger & Conditions

alias: Detect human shipper at the gate
description: ""
triggers:
  - topic: frigate/events
    trigger: mqtt
conditions:
  - condition: template
    value_template: |
      {{ trigger.payload_json['after']['label'] == 'person' }}

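The automation fires on every message published to frigate/events, but the condition lets it continue only when the detected label is person. This initial filter keeps non-person detections from reaching the LLM.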
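Frigate publishes several messages per tracked object (with type set to new, update, or end), so a label check alone can still pass more than once for the same person. A minimal sketch of a stricter filter, assuming the Frigate camera is named gate_camera (a hypothetical name; Frigate camera names are configured separately from Home Assistant entity IDs):

```yaml
conditions:
  - condition: template
    value_template: >-
      {{ trigger.payload_json['type'] == 'new'
         and trigger.payload_json['after']['label'] == 'person'
         and trigger.payload_json['after']['camera'] == 'gate_camera' }}
```

Restricting the condition to new events means at most one LLM call per tracked person, at the cost of analyzing the earliest frame rather than a later and possibly clearer one.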

2. LLM Integration

actions:
  - action: ai_task.generate_data
    response_variable: shipper_count
    data:
      instructions: |-
        You are an image analysis assistant.

        Task: Count the shippers (delivery people) standing in front of the gate in the image.

        Rules:
        - Rely only on occupational cues: delivery uniforms, helmets with a logo,
          courier-branded jackets, delivery bags, parcel boxes, delivery motorbikes,
          and hand-over behavior.
        - Do not identify individuals.
        - Count only people who clearly show signs of being a shipper.
        - Ignore passers-by and anyone without shipper cues.
        - If the gate is unclear or not visible from this camera angle, count only
          shippers in the area in front of the camera.
        - Return exactly one line in the format: "shipper_count: <number>"

        Output: A single line only, with no additional line breaks.
      entity_id: ai_task.gemini_flash
      attachments:
        media_content_id: media-source://camera/camera.gate_camera
        media_content_type: application/vnd.apple.mpegurl
        metadata:
          title: Gate Camera
          thumbnail: /api/camera_proxy/camera.gate_camera
          media_class: video
          children_media_class: null
          navigateIds:
            - {}
            - media_content_type: app
              media_content_id: media-source://camera
      task_name: Camera AI

Key points:
- Prompt Engineering: Detailed instructions about visual cues to identify shippers
- Privacy: Instruct the LLM not to recognize personal identities
- Structured Output: Standard format "shipper_count:" for easy parsing
- Media Attachment: The camera entity is attached directly as a media source, with no separate snapshot step
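As an alternative to parsing a text line, ai_task.generate_data can be asked for structured data. The sketch below assumes a Home Assistant version whose AI Task integration supports the structure field and an LLM integration that honors it. The field name shipper_count, the variable name shipper_result, and the shortened prompt are illustrative choices, not part of the original automation:

```yaml
actions:
  - action: ai_task.generate_data
    response_variable: shipper_result
    data:
      task_name: Camera AI
      entity_id: ai_task.gemini_flash
      instructions: >-
        Count the shippers (delivery people) in front of the gate.
        Count only people with clear delivery cues and do not identify individuals.
      # Assumption: the installed version supports structured output
      structure:
        shipper_count:
          description: Number of shippers visible in the image
          required: true
          selector:
            number:
      attachments:
        media_content_id: media-source://camera/camera.gate_camera
        media_content_type: image/jpeg
```

With this variant the count would be read as {{ shipper_result.data.shipper_count | int(0) }}, so no regex is needed.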

3. Data Processing & State Management

  - if:
      - condition: template
        value_template: >-
          {{ (shipper_count['data'] | regex_findall_index('([0-9]+)', 0) |
          int(0)) > 0 }}
    then:
      - action: counter.set_value
        data:
          value: >-
            {{ shipper_count['data'] | regex_findall_index('([0-9]+)', 0) |
            int(0) }}
        target:
          entity_id:
            - counter.shipper_count
      - delay:
          minutes: 5
      - action: counter.reset
        data: {}
        target:
          entity_id:
            - counter.shipper_count
mode: single

Key points:
- Regex Parsing: Extract the number from the LLM response
- Conditional Logic: Only update when there is at least one shipper (> 0)
- Auto Cleanup: Reset after 5 minutes
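regex_findall_index raises a template error when the response contains no digits at all, for example if the model answers in free text. A slightly more defensive sketch parses the number once into a variable and falls back to 0. The variable name shippers is hypothetical:

```yaml
  - variables:
      # Falls back to 0 when the response contains no digits
      shippers: >-
        {{ shipper_count['data'] | regex_findall('[0-9]+') | first
           | default(0) | int(0) }}
  - if:
      - condition: template
        value_template: "{{ shippers > 0 }}"
    then:
      - action: counter.set_value
        target:
          entity_id: counter.shipper_count
        data:
          value: "{{ shippers }}"
```

Note that with mode: single, any trigger that arrives while the 5-minute delay is running is ignored, so the counter is not updated again until the reset has happened.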

Advanced Use Cases

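Multi-language Support

The prompt can be written in whichever language suits you, as long as the output format stays fixed. For example, a compact English version: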

instructions: >-
  Task: Count delivery persons at gate.
  Output format: "shipper_count: <number>"
  Language: English

Notification Integration

      # Hypothetical device name: the Companion App registers notify.mobile_app_<device_name>
      - action: notify.mobile_app_your_phone
        data:
          title: "🛵 Shipper detected"
          message: >-
            {{ shipper_count['data'] | regex_findall_index('([0-9]+)', 0) }}
            shipper(s) at the gate

History Tracking

      # system_log.write sends the message to the Home Assistant log at the given level
      - action: system_log.write
        data:
          message: >-
            Shipper detected: {{ shipper_count['data'] }}
          level: info

Performance Considerations

Cost Optimization

  • Use Gemini Flash instead of Pro for real-time scenarios
  • Implement rate limiting to avoid spamming LLM calls
  • Cache results for similar frames

Latency Management

  • Pre-process with local AI before sending to the LLM
  • Use streaming responses for faster feedback
  • Implement timeout mechanisms

Security & Privacy

  1. Local pre-filtering: Frigate performs person detection locally, and only then is an image sent to the LLM
  2. Data minimization: Only send frames where a person is detected
  3. Anonymity: The LLM is instructed not to identify personal identities
  4. Storage policy: Automatically delete media after processing

Debugging

LLM Response Format

# Test template
{{ shipper_count['data'] }}
# Should return: "shipper_count: 2"
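To see the raw response while the automation runs, one option is a temporary action placed right after the ai_task.generate_data call. A minimal sketch using a persistent notification (the title text is arbitrary):

```yaml
  - action: persistent_notification.create
    data:
      title: Camera AI raw response
      message: "{{ shipper_count['data'] }}"
```

The notification appears in the Home Assistant sidebar and can be removed once the output format looks stable.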

Camera Integration

# Verify camera stream
media_content_id: media-source://camera/camera.gate_camera

Regex Extraction

# Debug regex
{{ shipper_count['data'] | regex_findall_index('([0-9]+)', 0) }}

Future Advanced Features

  1. Multi-object detection: Extend detection to postal workers, delivery trucks, etc.
  2. Behavior analysis: Analyze behaviors (waiting, delivering, leaving, etc.)
  3. Smart lock integration: Automatically open the gate when a shipper is confirmed
  4. Voice notifications: Announce detected shippers by voice (text-to-speech)

Conclusion

Combining LLMs with Home Assistant moves a smart home from "reactive" to "proactive": the system does not just react to a detection, it interprets what it sees. From the simple task of counting shippers to more complex applications like behavior analysis, LLMs add contextual understanding to automation systems.


Have you implemented similar AI-powered automations? Share your experience!
