AI + IoT = Smart Surveillance: How Technology Helps Make Cities Safer


Imagine a city equipped with a system that can predict a crime before it even happens. It might sound like science fiction, a movie plot, or even a real-life version of Orwell’s Thought Police from 1984. But this is quickly becoming our reality thanks to the rise of Artificial Intelligence (AI) and its integration with Internet of Things (IoT) systems.
The world is changing rapidly – and cities are no exception. According to the World Health Organization, more than half of the global population already lives in urban areas. This means more movement, more people, and unfortunately, more risks: crime, accidents, and large crowds that are increasingly difficult to manage.
Traditional surveillance systems fall short in this context. Cameras simply record footage, which must later be reviewed manually – often hours of video – just to find a single relevant moment. It’s far too slow, especially when public safety is at stake. Even with real-time monitoring, a significant amount of human resources is required to maintain an effective level of oversight.
This is where modern technologies step in. With IoT, we can connect dozens of devices – cameras, microphones, motion sensors, sound detectors, temperature or even air quality sensors – into a single ecosystem. AI can process this data in real time: recognizing faces, detecting suspicious behavior, and identifying potential threats as they emerge. It all sounds promising – but it raises important questions: what hidden risks could arise, and how might this affect people’s everyday lives?
Possible system configurations
For many people, a smart surveillance system like this might seem like magic – and not just for those without a degree in Software Engineering. Even developers can find it a mystery. And it’s no wonder: designing and building such a system is a serious engineering challenge. Setting it up to work quickly and efficiently? That’s an engineering challenge times three.
The simplest effective architecture for a street video surveillance system that an experienced IoT/AI engineer can deploy consists of several clearly separated layers. At the edge level, IP cameras that support RTSP streaming and the ONVIF interoperability standard are installed at critical urban locations – intersections, near public buildings, or in high-risk areas. These cameras may be connected to local edge computing devices (such as NVIDIA Jetson), which perform basic video stream processing (motion detection, object recognition) directly on-site. Because only the results need to leave the site, this reduces network traffic and latency.
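To make the edge step more concrete, here is a minimal sketch of motion detection by frame differencing, one of the simplest techniques an edge device could run before deciding whether a frame deserves heavier analysis. It is an illustration under stated assumptions, not what any particular camera or Jetson pipeline does: frames are plain grayscale grids, and the names `changed_fraction`, `motion_detected` and both thresholds are hypothetical.

```python
from typing import List

# Assumption: a frame is a grayscale image given as rows of intensities (0-255).
Frame = List[List[int]]


def changed_fraction(prev: Frame, curr: Frame, pixel_threshold: int = 25) -> float:
    """Return the share of pixels whose intensity changed by more than pixel_threshold."""
    if len(prev) != len(curr) or any(len(a) != len(b) for a, b in zip(prev, curr)):
        raise ValueError("frames must have the same dimensions")
    total = sum(len(row) for row in curr)
    if total == 0:
        return 0.0
    changed = sum(
        1
        for row_prev, row_curr in zip(prev, curr)
        for p, c in zip(row_prev, row_curr)
        if abs(p - c) > pixel_threshold
    )
    return changed / total


def motion_detected(prev: Frame, curr: Frame, area_threshold: float = 0.02) -> bool:
    """Flag motion when more than area_threshold of the frame has changed."""
    return changed_fraction(prev, curr) > area_threshold


# Usage: an empty scene, then the same scene with a bright 3x3 object in it.
background = [[10] * 8 for _ in range(8)]
with_object = [row[:] for row in background]
for y in range(2, 5):
    for x in range(2, 5):
        with_object[y][x] = 200

print(motion_detected(background, background))   # nothing changed
print(motion_detected(background, with_object))  # 9 of 64 pixels changed
```

A real deployment would more likely use an optimized library and a trained detector, but the principle is the same: cheap filtering on-site, so that only interesting moments travel further.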
From edge devices, event data and metadata (e.g., «person detected at 22:31», coordinates, camera_ID) are transmitted over encrypted channels (TLS, or a VPN such as WireGuard) to the central infrastructure, which can be hosted in the cloud (e.g., AWS, GCP) or on municipal servers. Event streaming platforms such as Apache Kafka are used to provide high-throughput messaging and reliable event queues.
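As an illustration of what such an event might look like on the wire, the sketch below serializes a detection into a compact JSON payload. The `DetectionEvent` class and all of its field names are assumptions made for this example. The resulting bytes could serve as the value of a message in a queue such as Kafka, but no messaging client is used here.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DetectionEvent:
    """Hypothetical event schema; the field names are illustrative only."""
    camera_id: str
    event_type: str    # e.g. "person_detected"
    timestamp: str     # ISO 8601, UTC
    latitude: float
    longitude: float
    confidence: float  # detector score between 0 and 1


def encode_event(event: DetectionEvent) -> bytes:
    """Serialize the event to compact UTF-8 JSON with a stable key order."""
    return json.dumps(asdict(event), separators=(",", ":"), sort_keys=True).encode("utf-8")


def decode_event(payload: bytes) -> DetectionEvent:
    """Rebuild the event from its JSON payload."""
    return DetectionEvent(**json.loads(payload.decode("utf-8")))


event = DetectionEvent(
    camera_id="cam-0042",
    event_type="person_detected",
    timestamp=datetime(2025, 1, 15, 22, 31, tzinfo=timezone.utc).isoformat(),
    latitude=40.7128,
    longitude=-74.0060,
    confidence=0.91,
)
payload = encode_event(event)
print(payload.decode("utf-8"))
print(decode_event(payload) == event)
```

The point of the design is size: a payload like this is a few hundred bytes, while the video frame it describes is orders of magnitude larger.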
In the cloud, advanced AI-powered video analysis is performed: face identification, behavior analysis, object tracking – using tools such as TensorFlow, OpenCV, and YOLOv8 models. Models are deployed in Kubernetes clusters with GPU support, enabling horizontal scaling depending on the number of video streams.
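Horizontal scaling ultimately comes down to arithmetic: how many inference workers are needed to keep up with the incoming frames? The sketch below shows one way to estimate it. The function name `gpu_workers_needed`, the headroom parameter and every number in the usage example are assumptions for illustration, not measurements from a real cluster.

```python
import math


def gpu_workers_needed(
    num_streams: int,
    analyzed_fps: float,
    worker_capacity_fps: float,
    headroom: float = 0.2,
) -> int:
    """Estimate how many inference workers are needed for the given load.

    num_streams          number of video streams sent for analysis
    analyzed_fps         frames per second analyzed from each stream
    worker_capacity_fps  frames per second a single worker can process
    headroom             share of each worker's capacity kept free for bursts
    """
    if num_streams < 0 or analyzed_fps <= 0 or worker_capacity_fps <= 0:
        raise ValueError("stream count must be >= 0 and rates must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_capacity = worker_capacity_fps * (1 - headroom)
    return math.ceil(num_streams * analyzed_fps / usable_capacity)


# Usage with made-up numbers: 200 streams at 5 analyzed frames per second,
# each worker handling 150 frames per second, 20% of capacity kept in reserve.
print(gpu_workers_needed(200, 5, 150))  # 1000 fps / 120 usable fps per worker -> 9
```

In a Kubernetes setup, a figure like this would inform the number of replicas, whether it is set by hand or by an autoscaler.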
The collected analytical events are stored in the analytics layer, using databases like PostgreSQL (with PostGIS for spatial analytics), Elasticsearch (for face and event search), or time-series databases like InfluxDB. These are then visualized through a web interface for operators (built with React or Angular + Leaflet/Mapbox for mapping). Finally, a rule-based alerting system (e.g., built on Apache Flink or Prometheus Alertmanager) suggests appropriate operator actions – such as dispatching a patrol or logging the incident.
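The rule-based alerting step can be sketched in plain Python as a stand-in for what a stream processor would do. The rule below suggests an action when the same camera reports a given event several times within a short window. The class `RepeatedEventRule`, its parameters and the action text are hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class RepeatedEventRule:
    """Suggest an action when one camera reports an event type
    at least min_count times within window_seconds."""

    def __init__(self, event_type: str, min_count: int, window_seconds: float, action: str):
        self.event_type = event_type
        self.min_count = min_count
        self.window_seconds = window_seconds
        self.action = action
        # Recent matching timestamps, kept separately for every camera.
        self._recent: Dict[str, Deque[float]] = defaultdict(deque)

    def observe(self, camera_id: str, event_type: str, timestamp: float) -> Optional[str]:
        """Feed one event; return a suggested action or None."""
        if event_type != self.event_type:
            return None
        times = self._recent[camera_id]
        times.append(timestamp)
        # Drop events that have fallen out of the time window.
        while times and timestamp - times[0] > self.window_seconds:
            times.popleft()
        if len(times) >= self.min_count:
            times.clear()  # avoid alerting again on the same burst
            return f"{self.action} (camera {camera_id})"
        return None


# Usage: three detections from the same camera within one minute.
rule = RepeatedEventRule("person_detected", min_count=3, window_seconds=60, action="dispatch patrol")
for ts in (0, 20, 45):
    suggestion = rule.observe("cam-0042", "person_detected", ts)
print(suggestion)
```

Note that the rule only suggests an action. Keeping a human operator in the loop is a design choice that matters a great deal once false alerts enter the picture.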


Advantages of this architecture:
- Reduced latency – edge analytics shortens response time.
- Scalability – new cameras or locations can be added without full system refactoring.
- Lower system load – thanks to pre-processing on edge devices.
- Security – ensured via TLS, VPN, and restricted API access.
Disadvantages:
- Cost – edge devices with GPUs are expensive.
- Administrative complexity – many components require constant updates and monitoring.
- High network demands – stable VPN connections and sufficient bandwidth are critical.
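The trade-off between edge pre-processing and network demands can be estimated with back-of-the-envelope arithmetic. The sketch below compares streaming every camera's video to the center with sending only event metadata. Both function names and all numbers (camera count, bitrate, event rate, event size) are assumptions chosen for illustration.

```python
def video_bandwidth_mbps(num_cameras: int, bitrate_mbps: float) -> float:
    """Total bandwidth if every camera streams its video to the center."""
    return num_cameras * bitrate_mbps


def event_bandwidth_mbps(num_cameras: int, events_per_second: float, event_size_bytes: int) -> float:
    """Total bandwidth if cameras send only event metadata."""
    bits_per_second = num_cameras * events_per_second * event_size_bytes * 8
    return bits_per_second / 1_000_000


# Made-up example: 1,000 cameras at 4 Mbit/s each,
# versus 2 events per second of 300 bytes per camera.
full_video = video_bandwidth_mbps(1000, 4.0)
events_only = event_bandwidth_mbps(1000, 2.0, 300)
print(f"full video:  {full_video:.1f} Mbit/s")
print(f"events only: {events_only:.1f} Mbit/s")
```

Under these assumptions the metadata stream is a tiny fraction of the video stream. The network still has to be stable, though: whenever the center requests footage for review, the full video bitrate is back in play.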
Is it already implemented somewhere?
And if you’re truly inspired by the idea – and happen to have a few extra million to invest in development – don’t wait too long. Some of these surveillance systems have already been in use for years.
For example, in Singapore, there’s a system capable of detecting unattended items in public spaces. Someone leaves a suitcase at a train station – it might be simple forgetfulness, or it might be a potential threat.
In New York City, USA, the police use the Domain Awareness System, which integrates over 18,000 cameras along with data from license plate readers, 911 calls, and other sources to monitor and analyze citywide activity.
In São Paulo, Brazil, the Smart Sampa system includes 25,000 facial recognition cameras, which helped authorities apprehend over a thousand criminals within the first six months of operation.
In Shenzhen, China, AI and IoT are being heavily integrated into the city’s infrastructure to improve safety and traffic management. The local police, in collaboration with Huawei, have created a system that collects data from more than 700 million vehicles every month.
During the 2024 Olympic Games in Paris, an AI system was deployed to ensure safety at large-scale events. It analyzed footage from 485 cameras in real time, detecting unattended objects, unusual crowd movement, and other potential threats.
All of this can be neatly summed up with a quote from renowned Stanford computer science professor Fei-Fei Li: «AI-powered surveillance is no longer a futuristic concept – it’s already reshaping how cities operate.»
It’s hard to argue with that, although there are important caveats. At this point, it’s difficult to find reliable data or research confirming the effectiveness of such systems. And frankly, the fact that these practices are not yet widely adopted might suggest that a truly ideal smart surveillance system doesn’t yet exist. Still, progress is underway – and it’s safe to say that with each passing day, these systems are improving. Technological advancement doesn’t stand still.
Challenges encountered
Some of the current challenges developers face when building smart surveillance systems are highlighted in a study conducted at MIT (Massachusetts Institute of Technology). In this research, various AI models integrated into an IoT-based surveillance system were tested – not in a city, but in a single house. Even in that limited setting, the system faced both trivial and quite unexpected issues: false alerts to the police, failure to detect real threats, and even racially biased behavior from the surveillance system itself.
And that was just in a single home. The problems that might arise from implementing a similar system in an entire city – or even just one street or neighborhood – would likely scale with the complexity of the environment.
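One way developers can make problems like false alerts and biased behavior measurable is to compare error rates across groups of people. The sketch below computes the false positive rate per group from labeled test records. It is a generic fairness check, not the method used in the MIT study, and the record format, the group labels and the sample data are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Assumption: each record is (group label, alert raised, real threat present).
Record = Tuple[str, bool, bool]


def false_positive_rate_by_group(records: Iterable[Record]) -> Dict[str, float]:
    """For each group, return the share of harmless cases that still raised an alert."""
    false_positives: Dict[str, int] = defaultdict(int)
    harmless_cases: Dict[str, int] = defaultdict(int)
    for group, alert_raised, real_threat in records:
        if not real_threat:
            harmless_cases[group] += 1
            if alert_raised:
                false_positives[group] += 1
    return {group: false_positives[group] / count for group, count in harmless_cases.items()}


# Made-up test results: group B triggers false alerts twice as often as group A.
sample = [
    ("A", True, False), ("A", False, False), ("A", False, False), ("A", False, False),
    ("B", True, False), ("B", True, False), ("B", False, False), ("B", False, False),
    ("A", True, True),  # a correctly detected threat does not count as a false positive
]
print(false_positive_rate_by_group(sample))
```

A large gap between groups is a signal that the model should not go anywhere near a live alerting pipeline until the cause is understood.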
And these are only the technical problems on the way to developing a properly functioning product. There is also another issue – the ethical one. It was highlighted during the testing of the system at the Olympic Games in Paris, where a key concern was the lack of transparency surrounding the surveillance system. Who collected the data? How was it collected? What kind of data was gathered, and for what purpose? If it was for safety, then what was done with the data afterward? Where is it stored now? And these are just a few of the many questions being raised.
To put it simply – imagine a city where cameras are operating 24/7 on every street, with systems that analyze faces. It’s clear that someone has access to that system, and therefore access to data about a person’s daily movements. (Of course, this depends on the scale of the system, but if we’re talking about a city-wide system, it’s quite realistic that a person could be tracked from the moment they leave their home to the moment they return.)
In other words, a privacy issue arises. That means that in addition to the technical implementation, there is also the matter of data security, as well as legal considerations. And while the goal of such systems may be positive, the question remains – what if the data ends up in the wrong hands? The balance between security and privacy is one of the greatest challenges these systems face.
Conclusion
To sum it all up – the idea behind such a system is promising, and the reasons for its development are well-intentioned. But there are many nuances that need to be addressed before it can be deployed on a large scale. The technology itself is neutral – what matters is how it will be used. Developers, in particular, are the ones who should be asking the most critical questions: why, how, and for what purpose are they creating something? And as for the question: «Are you ready to live in a city where you’re always being watched – even if it’s for your safety?» – that’s something each person must answer for themselves.