06 / 09, 2024

Perception

A survey-style autonomous driving perception stack covering segmentation, detection, tracking, BEV, and 3D perception.

Role

Engineer, computer vision and autonomous driving perception experiments

Stack

Deep learning · Traditional CV · Data augmentation · Transfer learning · Road segmentation · Object tracking

The Problem

The README describes the project as a demonstration of current computer vision and perception techniques used in autonomous driving, with the goal of safer and more widely accepted self-driving systems.

The scope spans multiple perception tasks rather than a single model: road segmentation, 2D object detection, object tracking, 3D visualization, multi-task learning, bird's-eye view (BEV), and 3D object detection.

The Architecture

01 Perception task collection

The repository groups road segmentation, detection, tracking, BEV, multi-task learning, and 3D perception into one autonomous driving learning track.

02 Mixed methods

The README calls out deep learning, traditional computer vision techniques, data augmentation, and transfer learning as the main methods used across experiments.

03 Evaluation against real constraints

The README's results and challenges sections report strong performance for lane detection, object detection, and traffic sign recognition, while acknowledging limited data, hardware limits, and real-time requirements.
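As a rough illustration of the traditional computer vision techniques mentioned under Mixed methods, the sketch below runs a horizontal Sobel filter over a tiny synthetic road patch to find the edges of a bright lane stripe. It is not taken from the repository: the patch values, the threshold, and the helper names `sobel_x` and `edge_columns` are all assumptions chosen for the example. The point is only that this kind of filter needs no training data.

```python
# Hypothetical example, not the project's code: horizontal Sobel edge
# detection in plain Python, with no training data or third-party libraries.

SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def sobel_x(image):
    """Return the horizontal Sobel response; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    response = [[0] * width for _ in range(height)]
    for row in range(1, height - 1):
        for col in range(1, width - 1):
            total = 0
            # Slide the 3x3 kernel over the neighbourhood centred on (row, col).
            for dr in range(3):
                for dc in range(3):
                    total += SOBEL_X[dr][dc] * image[row + dr - 1][col + dc - 1]
            response[row][col] = total
    return response


def edge_columns(image, threshold):
    """Columns where some pixel has an absolute gradient >= threshold."""
    response = sobel_x(image)
    return sorted({
        col
        for row in response
        for col, value in enumerate(row)
        if abs(value) >= threshold
    })


# Synthetic 5x6 patch (assumed values): dark asphalt at intensity 10 with a
# bright lane stripe at intensity 200 in columns 3 and 4.
patch = [[10, 10, 10, 200, 200, 10] for _ in range(5)]
lane_edges = edge_columns(patch, threshold=400)
print(lane_edges)
```

On real images a filter like this would normally be followed by smoothing, region masking, and line fitting; those steps are omitted here to keep the sketch short.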

Decisions that mattered

1.

Cover breadth before path planning

The project focuses on perception building blocks first, leaving obstacle avoidance and path planning as future extensions.

2.

Keep traditional CV in the toolbox

The README notes that traditional computer vision can still be effective, especially where data is limited, so the case study does not frame the stack as purely neural.

3.

Treat real-time performance as a challenge

Autonomous driving perception must keep up with the environment, making hardware constraints and latency part of the project story.
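One way to make the real-time constraint concrete is to compare measured per-frame latency against the time budget implied by a camera frame rate. The sketch below is a minimal illustration under stated assumptions: the 30 FPS target, the `dummy_perception_step` workload, and all function names are hypothetical and do not come from the README.

```python
import time


def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def measure_latency_ms(step, frame, repeats=20):
    """Average wall-clock time of step(frame) over several runs, in ms."""
    start = time.perf_counter()
    for _ in range(repeats):
        step(frame)
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / repeats


def meets_budget(latency_ms, fps):
    """True if one frame can be processed before the next one arrives."""
    return latency_ms <= frame_budget_ms(fps)


def dummy_perception_step(frame):
    # Placeholder workload standing in for a real model's forward pass.
    return sum(frame)


TARGET_FPS = 30  # assumed camera rate, not stated in the source
frame = list(range(1000))
latency = measure_latency_ms(dummy_perception_step, frame)
print(f"budget: {frame_budget_ms(TARGET_FPS):.1f} ms per frame")
print(f"within budget: {meets_budget(latency, TARGET_FPS)}")
```

Averaging hides worst-case spikes, so a stricter check would track the maximum or a high percentile of per-frame latency on the target hardware.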

The Numbers

7

perception areas

BEV

bird's-eye view

3D

object detection

RT

performance target

What it taught me

Self-driving perception is a stack of related tasks, not one isolated detector.

Limited data and hardware constraints shape model choices as much as benchmark accuracy does.