SleepCam is a native iOS application that captures a low-resolution photo of the user roughly once per minute through the night, optionally records audio, and presents the recording as a fast-scrubbing photo timelapse with an optional Apple Health sleep-stage overlay. All data stays on the user's device. There is no server, no account, no cloud sync, no analytics, and no third-party SDK. The app is built in Swift / SwiftUI for iOS 18 and later, on iPhone hardware that includes a TrueDepth front camera (introduced with iPhone X; iOS 18 itself requires iPhone XS / XR or later).
Designing an overnight, self-pointed, completely silent, dark-room camera that runs unattended on an off-the-shelf iPhone exposes a number of OS-level constraints that aren't obvious until you try. This document walks through the specific technical choices SleepCam makes, in enough detail for a competent iOS engineer to reproduce them.
An iPhone resting on a nightstand pointed at a sleeping person has effectively no ambient light. The standard front camera's exposure system will either return a black image or push ISO so high that the image is uselessly noisy. Neither the phone's flashlight nor screen-as-flashlight tricks are options, because either would wake the sleeper.
iPhones with a TrueDepth front camera include two emitters that operate in the near-infrared band around 940 nm: a flood illuminator (a diffuse IR source used by Face ID to even out ambient lighting) and a dot projector (a structured-light source for depth). Of these, only the flood illuminator is useful here — it produces a uniform IR field rather than a structured pattern.
The TrueDepth front camera's image sensor includes an IR-cut filter, but the cut is imperfect and a measurable fraction of 940 nm light passes through and registers, particularly in the red channel. In practice, in a dark room with the flood illuminator active, the front camera produces a usable, low-noise grayscale-tinted image of a subject within roughly 1–1.5 m, more than sufficient for distinguishing sleep position, head orientation, and major body movement at the resolution SleepCam actually needs (≤ 1280 × 960).
The flood illuminator is not directly exposed as a configurable AVFoundation device. However, it is implicitly activated whenever the system uses the TrueDepth depth pipeline (e.g., a session configured with depth output) or whenever certain configurations of the TrueDepth front camera's auto-exposure compensate for low light. SleepCam configures an AVCaptureSession with the front builtInTrueDepthCamera device, attaches an AVCaptureVideoDataOutput, and uses session and device configurations that result in the flood illuminator being on during capture. The user is informed of this behavior and consents to it via standard iOS camera-permission prompts.
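The session setup described above can be sketched roughly as follows. This is a minimal sketch, not SleepCam's actual code: the `FrameGrabber` class name, the queue label, and the error codes are hypothetical. It uses only public AVFoundation calls and does not control when the system engages the flood illuminator.

```swift
import AVFoundation

// Hypothetical wrapper; class name, queue label, and error codes are illustrative.
final class FrameGrabber: NSObject, AVCaptureVideoDataOutputSampleBufferDelegate {
    let session = AVCaptureSession()
    private let queue = DispatchQueue(label: "sleepcam.frames")

    func configure() throws {
        // Front TrueDepth camera; nil on hardware without one.
        guard let camera = AVCaptureDevice.default(.builtInTrueDepthCamera,
                                                   for: .video,
                                                   position: .front) else {
            throw NSError(domain: "FrameGrabber", code: 1)
        }
        let input = try AVCaptureDeviceInput(device: camera)
        let output = AVCaptureVideoDataOutput()
        output.alwaysDiscardsLateVideoFrames = true
        output.setSampleBufferDelegate(self, queue: queue)

        session.beginConfiguration()
        defer { session.commitConfiguration() }
        guard session.canAddInput(input), session.canAddOutput(output) else {
            throw NSError(domain: "FrameGrabber", code: 2)
        }
        session.addInput(input)
        session.addOutput(output)
        // Low-resolution preset, applied only if the device supports it.
        if session.canSetSessionPreset(.hd1280x720) {
            session.sessionPreset = .hd1280x720
        }
    }

    func captureOutput(_ output: AVCaptureOutput,
                       didOutput sampleBuffer: CMSampleBuffer,
                       from connection: AVCaptureConnection) {
        // Frames stream in here continuously on `queue`.
    }
}
```

After `configure()` succeeds, `session.startRunning()` would normally be called off the main thread, since it blocks until the session is running.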
The capture device is AVCaptureDevice.default(.builtInTrueDepthCamera, for: .video, position: .front).
setTorchMode.
A low session preset is used (.hd1280x720 or lower). Lower resolution reduces per-frame storage cost and reduces motion-blur sensitivity at the long shutter times the auto-exposure system will choose in low light.
iOS, by regulatory default in some locales (notably Japan and Korea) and by user-perceptible default elsewhere, plays a synthesized shutter sound whenever AVCapturePhotoOutput's capturePhoto(with:delegate:) is invoked. There is no public API to silence it across all locales. A click every 60 seconds in a dark bedroom would defeat the entire product premise.
SleepCam does not use AVCapturePhotoOutput at all. The capture session is configured with an AVCaptureVideoDataOutput; sample buffers stream continuously to the application via the AVCaptureVideoDataOutputSampleBufferDelegate on a dedicated dispatch queue. At each capture-interval tick, the application captures the next sample buffer's CVPixelBuffer from the running video stream, converts it to a JPEG (or HEIC) at the configured resolution, and writes it to the on-device session folder. The video output stream is not recorded as a movie — only single frames are persisted.
This bypasses the still-photo audio policy entirely. The capture is silent in every locale.
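// Sketch of the per-tick capture path (requires AVFoundation, CoreImage, ImageIO).
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    // captureNowFlag is assumed to be an atomic Bool wrapper: exchange(false)
    // returns the previous value and clears it, so one frame is kept per tick.
    guard captureNowFlag.exchange(false) else { return }
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    // JPEG quality is passed through the ImageIO lossy-compression key.
    let qualityKey = CIImageRepresentationOption(
        rawValue: kCGImageDestinationLossyCompressionQuality as String)
    let jpeg = ciContext.jpegRepresentation(
        of: ciImage,
        colorSpace: CGColorSpaceCreateDeviceRGB(),
        options: [qualityKey: 0.7]
    )
    try? jpeg?.write(to: nextFrameURL(for: session))
    // Also write a downsampled thumbnail sidecar at this point; see §5.
}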
A timer (a foreground-app DispatchSourceTimer or a periodic scheduler keyed off wall-clock time, not CADisplayLink) sets captureNowFlag at the configured cadence. Between captures, sample buffers from the video output are simply discarded.
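One way to realize the flag-and-timer arrangement is sketched below. It assumes a lock-protected `CaptureFlag` class and a `makeCadenceTimer` helper, both hypothetical names; the text above does not say how captureNowFlag is actually implemented.

```swift
import Foundation
import Dispatch

/// Lock-protected Bool with an atomic "store new value, return old value" step.
/// Hypothetical stand-in for the captureNowFlag used in the delegate callback.
final class CaptureFlag {
    private let lock = NSLock()
    private var value = false

    /// Stores `newValue` and returns the value held before the call.
    func exchange(_ newValue: Bool) -> Bool {
        lock.lock(); defer { lock.unlock() }
        let old = value
        value = newValue
        return old
    }
}

/// Builds a repeating timer, scheduled against wall-clock time, that raises
/// `flag` once per interval. The caller keeps a strong reference to the
/// returned timer and calls resume() to start it.
func makeCadenceTimer(every seconds: Int, flag: CaptureFlag) -> DispatchSourceTimer {
    let queue = DispatchQueue(label: "sleepcam.cadence")  // illustrative label
    let timer = DispatchSource.makeTimerSource(queue: queue)
    timer.schedule(wallDeadline: .now(), repeating: .seconds(seconds), leeway: .seconds(1))
    timer.setEventHandler { _ = flag.exchange(true) }
    return timer
}
```

The delegate callback then calls `flag.exchange(false)`: the first frame after a tick sees `true` and is saved, and every later frame sees `false` until the next tick.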
iOS does not allow a third-party application to capture from the camera while in the background. Apps that have tried to extend background activity through location-permission keep-alive hacks have been rejected from the App Store. SleepCam therefore must run in the foreground throughout an overnight session. For a session of 7+ hours, the screen must be effectively invisible, both for the sleeper's comfort and for battery preservation.
The app declares only the audio background mode (because audio recording, if enabled, must not stop when the device locks the display).
The app sets UIScreen.main.brightness to a very low value (the practical floor on most iPhones is around 0.0–0.01, which on OLED hardware renders effectively black). The original brightness is restored when the session ends or the app is foregrounded for review.
The app sets UIApplication.shared.isIdleTimerDisabled = true so the device does not auto-lock and stop the foreground capture.
The combination (accept the foreground constraint, dim the screen, declare only the audio background mode, plug in) sidesteps both the OS limitation and the App Store review risk.
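A minimal sketch of the dim-and-restore behavior, assuming a hypothetical `ScreenDimmer` helper and a default dim level of 0.01 chosen from the range quoted above:

```swift
import UIKit

/// Hypothetical helper: dims the screen for an overnight session and
/// restores the previous state when the session ends.
@MainActor
final class ScreenDimmer {
    private var savedBrightness: CGFloat?

    func beginSession(dimLevel: CGFloat = 0.01) {
        savedBrightness = UIScreen.main.brightness
        UIScreen.main.brightness = dimLevel
        // Prevent auto-lock so the foreground capture keeps running.
        UIApplication.shared.isIdleTimerDisabled = true
    }

    func endSession() {
        if let saved = savedBrightness {
            UIScreen.main.brightness = saved
        }
        savedBrightness = nil
        UIApplication.shared.isIdleTimerDisabled = false
    }
}
```

Saving the brightness at session start rather than at app launch means any change the user made in between is what gets restored.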
HealthKit's HKCategoryType for sleep analysis can return overlapping samples from different sources. In particular, an Apple Watch typically writes both a broad inBed envelope sample covering the whole night and finer asleepCore, asleepDeep, asleepREM, and awake samples for sub-intervals within that envelope. Naïvely rendering all returned samples produces a band where the broad envelope visually swamps the finer staged data.
SleepCam fetches all sleep samples in the session window via HKSampleQuery, then runs a two-pass dedupe before rendering:
First pass: keep the staged samples as they are (asleepCore, asleepDeep, asleepREM, awake).
Second pass: for each envelope sample (inBed), subtract the union of all staged intervals from it. The remaining sub-intervals, meaning those parts of the envelope not covered by any richer staged sample, are kept as envelope-class intervals. Sub-intervals shorter than a minimum (60 seconds) are discarded as visually noisy.
// Pseudocode for the dedupe.
// Returns the parts of `envelope` not covered by any staged interval,
// dropping leftovers shorter than 60 seconds.
func subtract(staged: [Interval], from envelope: Interval) -> [Interval] {
    var pieces: [(Date, Date)] = [(envelope.start, envelope.end)]
    for s in staged where s.end > envelope.start && s.start < envelope.end {
        var next: [(Date, Date)] = []
        for (pStart, pEnd) in pieces {
            // No overlap with this piece: keep it unchanged.
            if s.end <= pStart || s.start >= pEnd {
                next.append((pStart, pEnd)); continue
            }
            // Overlap: keep whatever sticks out on either side of `s`.
            if s.start > pStart { next.append((pStart, s.start)) }
            if s.end < pEnd { next.append((s.end, pEnd)) }
        }
        pieces = next
        if pieces.isEmpty { break }
    }
    return pieces
        .filter { $0.1.timeIntervalSince($0.0) >= 60 }
        .map { Interval(stage: envelope.stage, start: $0.0, end: $0.1) }
}

// Staged samples pass through untouched; envelopes are trimmed around them.
func dedupe(_ intervals: [Interval]) -> [Interval] {
    let staged = intervals.filter { $0.stage.isStaged }
    let envelopes = intervals.filter { !$0.stage.isStaged }
    let trimmedEnvelopes = envelopes.flatMap { subtract(staged: staged, from: $0) }
    return (staged + trimmedEnvelopes).sorted { $0.start < $1.start }
}
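The pseudocode above leaves the Stage and Interval types undefined. The following self-contained sketch assumes minimal definitions for both (hypothetical, not SleepCam's actual types), restates the subtraction compactly, and walks through one night expressed as second offsets from an arbitrary start.

```swift
import Foundation

// Assumed minimal types; the real app's definitions may differ.
enum Stage {
    case inBed, asleepCore, asleepDeep, asleepREM, awake
    var isStaged: Bool { self != .inBed }
}

struct Interval {
    let stage: Stage
    let start: Date
    let end: Date
}

/// Parts of `envelope` not covered by any staged interval, at least `minimum` long.
func leftoverEnvelope(staged: [Interval], from envelope: Interval,
                      minimum: TimeInterval = 60) -> [Interval] {
    var pieces: [(Date, Date)] = [(envelope.start, envelope.end)]
    for s in staged {
        pieces = pieces.flatMap { piece -> [(Date, Date)] in
            let (pStart, pEnd) = piece
            if s.end <= pStart || s.start >= pEnd { return [piece] }  // no overlap
            var out: [(Date, Date)] = []
            if s.start > pStart { out.append((pStart, s.start)) }
            if s.end < pEnd { out.append((s.end, pEnd)) }
            return out
        }
    }
    return pieces
        .filter { $0.1.timeIntervalSince($0.0) >= minimum }
        .map { Interval(stage: envelope.stage, start: $0.0, end: $0.1) }
}

// Worked example: an 8-hour inBed envelope with four staged samples inside it.
let t0 = Date(timeIntervalSinceReferenceDate: 0)
func iv(_ stage: Stage, _ a: TimeInterval, _ b: TimeInterval) -> Interval {
    Interval(stage: stage, start: t0.addingTimeInterval(a), end: t0.addingTimeInterval(b))
}
let envelope = iv(.inBed, 0, 28_800)
let stagedSamples = [
    iv(.asleepCore, 600, 9_970),
    iv(.asleepDeep, 10_000, 14_000),
    iv(.awake, 14_000, 14_030),
    iv(.asleepREM, 14_030, 28_000),
]
let leftover = leftoverEnvelope(staged: stagedSamples, from: envelope)
// Two envelope pieces survive: 0–600 s and 28 000–28 800 s.
// The 30-second gap between core and deep sleep is dropped by the 60-second minimum.
```

Rendering the four staged samples plus the two surviving envelope pieces gives a band with no overlapping segments.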
HealthKit does not reliably report whether read authorization was granted (its authorizationStatus(for:) returns share/write status). SleepCam therefore treats an empty result set as "either no data or denied," and surfaces a soft hint pointing the user at the Health app's privacy settings rather than asserting authorization state.
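The fetch and the "empty means unknown" handling might look like the sketch below. The function name and completion shape are assumptions; the HealthKit calls are the public ones for sleep-analysis category samples.

```swift
import HealthKit

/// Fetches sleep-analysis samples that overlap the session window.
/// An empty array means "no data or read access denied"; the two cases are
/// not distinguished here, matching the behavior described above.
func fetchSleepSamples(store: HKHealthStore,
                       from start: Date,
                       to end: Date,
                       completion: @escaping ([HKCategorySample]) -> Void) {
    let sleepType = HKCategoryType(.sleepAnalysis)
    let predicate = HKQuery.predicateForSamples(withStart: start, end: end, options: [])
    let byStart = NSSortDescriptor(key: HKSampleSortIdentifierStartDate, ascending: true)
    let query = HKSampleQuery(sampleType: sleepType,
                              predicate: predicate,
                              limit: HKObjectQueryNoLimit,
                              sortDescriptors: [byStart]) { _, samples, _ in
        // Errors and missing results collapse to an empty list on purpose.
        completion((samples as? [HKCategorySample]) ?? [])
    }
    store.execute(query)
}
```

A caller would first request read access with `requestAuthorization(toShare:read:)`, then show the soft hint about Health privacy settings whenever the completion receives an empty array.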
A 7-hour session at 60-second cadence produces ~420 photos. At 1× playback speed (one photo per real second), this is fine. At 10× — 10 photos per real second — naïve SwiftUI Image rendering produces visible stutter on iPhone X / iPhone 11-class hardware because SwiftUI re-evaluates view bodies and diffs the view tree at each frame change. The decode-and-display path also dominates the per-frame budget when full-resolution JPEGs are decoded for every step of the scrubber.
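The arithmetic behind those figures can be written out as a short sketch. The constant and function names are illustrative.

```swift
import Foundation

let sessionHours = 7.0
let cadenceSeconds = 60.0

// Number of photos in the session: 7 h × 3600 s/h ÷ 60 s per photo.
let frameCount = Int(sessionHours * 3600 / cadenceSeconds)

/// Milliseconds available to decode and display one photo at a given speed,
/// where 1× means one photo per real second.
func frameBudgetMs(speed: Double) -> Double {
    1000.0 / speed
}

/// Real seconds needed to play the whole session at a given speed.
func playbackSeconds(frames: Int, speed: Double) -> Double {
    Double(frames) / speed
}
```

At 10× the budget is 100 ms per photo, and the full 420-photo night plays back in 42 seconds.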
A small JSON index (sessions_index.json) at the top of the on-device sessions directory caches enough metadata about each session (id, start time, duration, frame count, cover thumbnail name) to render the home screen instantly without enumerating directories. The index is updated whenever a session ends or is deleted; it is rebuilt from the filesystem if missing or corrupt.
SwiftUI's Image diffs at the SwiftUI-tree level and is too slow at 10 Hz on older devices. A UIImageView wrapped in UIViewRepresentable, with the underlying image property assigned directly each tick, bypasses SwiftUI diffing.
Frames are decoded through CGImageSourceCreateThumbnailAtIndex with kCGImageSourceShouldCacheImmediately. ImageIO's thumbnail path is faster than full-frame decoding and produces a CGImage already in the cached, render-ready state. This avoids a re-decode the first time the image is drawn.
The player keeps an NSCache<NSNumber, UIImage> keyed by frame index. On each tick it prefetches the ±8 frames around the cursor on a background queue. NSCache automatically evicts under memory pressure.
let displayLink = CADisplayLink(target: self, selector: #selector(tick))
displayLink.preferredFramesPerSecond = 30
displayLink.add(to: .main, forMode: .common)

@objc func tick() {
    let now = CACurrentMediaTime()
    let elapsed = now - playbackStartedAt
    let virtualSeconds = elapsed * playbackSpeed  // 1×, 5×, 10×
    // Assumes captureRateHz is expressed in frames per virtual second
    // (1.0 if 1× means one photo per real second, as defined above).
    let targetIndex = Int(virtualSeconds * captureRateHz)
    if let image = cache.object(forKey: NSNumber(value: targetIndex)) {
        imageView.image = image
        currentIndex = targetIndex
    } else {
        // Skip-on-miss: do nothing this tick. Prefetcher will deliver soon.
        // (Optionally, on a long stall, fall back to a decoded thumbnail
        // synchronously to avoid a frozen viewport.)
    }
    schedulePrefetch(around: targetIndex, window: 8)
}
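The tick handler does not show how the target index or the prefetch window are kept inside the session's bounds. A minimal sketch of that index arithmetic follows, assuming 1× means one photo per real second. The function names and the `framesPerSecondAt1x` parameter are hypothetical.

```swift
import Foundation

/// Frame index to show after `elapsed` real seconds, clamped to the session.
/// Assumes 1× playback shows `framesPerSecondAt1x` photos per real second.
func targetIndex(elapsed: Double, speed: Double, frameCount: Int,
                 framesPerSecondAt1x: Double = 1.0) -> Int {
    precondition(frameCount > 0, "session must contain at least one frame")
    let raw = Int(elapsed * speed * framesPerSecondAt1x)
    return min(max(raw, 0), frameCount - 1)
}

/// Indices to prefetch: the cursor ± `window`, clipped to valid frames.
func prefetchIndices(around index: Int, window: Int, frameCount: Int) -> ClosedRange<Int> {
    precondition(frameCount > 0 && window >= 0)
    let cursor = min(max(index, 0), frameCount - 1)
    let lower = max(0, cursor - window)
    let upper = min(frameCount - 1, cursor + window)
    return lower...upper
}
```

Clamping before the cache lookup means playback stops on the last frame, and the prefetcher never asks for a file that does not exist.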
The user wants to know when something audible happened during the night — a snore, a cough, a partner moving — without the app making any classification claims it can't back up. ML-based snore detection is a meaningfully different product surface and a regulatory minefield.
If the user enables audio, SleepCam records the entire session as a single AAC file via AVAudioRecorder in a background-capable audio session (.playAndRecord with .mixWithOthers off, .duckOthers off, .allowAirPlay off; route override left to system).
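A sketch of that recording setup is shown below. The function name, sample rate, channel count, and quality level are assumptions chosen for illustration; the category and empty option set follow the description above.

```swift
import AVFoundation

/// Starts a single AAC recording for the whole session.
/// Sample rate, channel count, and quality below are illustrative values.
func startNightRecording(to url: URL) throws -> AVAudioRecorder {
    let audioSession = AVAudioSession.sharedInstance()
    // No mixWithOthers, duckOthers, or allowAirPlay: the option set is empty.
    try audioSession.setCategory(.playAndRecord, mode: .default, options: [])
    try audioSession.setActive(true)

    let settings: [String: Any] = [
        AVFormatIDKey: Int(kAudioFormatMPEG4AAC),
        AVSampleRateKey: 22_050.0,
        AVNumberOfChannelsKey: 1,
        AVEncoderAudioQualityKey: AVAudioQuality.medium.rawValue,
    ]
    let recorder = try AVAudioRecorder(url: url, settings: settings)
    guard recorder.record() else {
        throw NSError(domain: "NightRecording", code: 1)  // hypothetical error
    }
    return recorder
}
```

Recording also requires a microphone usage description in Info.plist, and the audio background mode if it must continue while the display is locked.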
For visualization, SleepCam computes a per-bin RMS amplitude over the recorded file (or a streaming buffer during live recording) at a fixed bin resolution (e.g., 1 bin per 5–10 seconds). Bins exceeding a threshold relative to the session's median amplitude are flagged as "loud regions." The waveform UI renders these regions in red over the otherwise-neutral waveform color. There is no audio classification — no snore/cough/talk detection, no model inference, no labels beyond "loud" and "not loud."
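The binning and flagging step can be sketched on a plain array of PCM samples. The function names and the default factor of 3× the median are assumptions. The text above specifies only "a threshold relative to the session's median amplitude".

```swift
import Foundation

/// RMS amplitude of each fixed-size bin of samples (a final partial bin is included).
func rmsBins(samples: [Float], binSize: Int) -> [Float] {
    precondition(binSize > 0)
    return stride(from: 0, to: samples.count, by: binSize).map { start -> Float in
        let bin = samples[start..<min(start + binSize, samples.count)]
        let meanSquare = bin.reduce(Float(0)) { $0 + $1 * $1 } / Float(bin.count)
        return meanSquare.squareRoot()
    }
}

/// Flags bins whose RMS exceeds `factor` times the median bin RMS.
/// The default factor is an illustrative assumption.
func loudFlags(bins: [Float], factor: Float = 3.0) -> [Bool] {
    guard !bins.isEmpty else { return [] }
    let sorted = bins.sorted()
    let mid = sorted.count / 2
    let median = sorted.count % 2 == 0
        ? (sorted[mid - 1] + sorted[mid]) / 2
        : sorted[mid]
    return bins.map { $0 > median * factor }
}

// Five two-sample bins; only the third contains a loud event.
let demoSamples: [Float] = [0.1, 0.1, 0.1, -0.1, 0.9, -0.9, 0.1, 0.1, 0.1, 0.1]
let demoFlags = loudFlags(bins: rmsBins(samples: demoSamples, binSize: 2))
```

Using the median rather than the mean keeps a few very loud events from raising the baseline and hiding quieter ones.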
SleepCam captures imagery of a sleeping person, often including the user's face. The application contains no networking code beyond standard URL-handling and the user-initiated share-sheet path. There is no analytics SDK, no advertising SDK, no crash-reporting SDK, no third-party SDK of any kind. Photos and audio are written into the application's private container and never read or transmitted by SleepCam itself.
The app does not perform face detection, ARKit face tracking, Vision face APIs, or any biometric extraction. No face template, face mesh, face print, or facial-landmark data is computed or stored. The captured 2D images are written to disk and read back only for display in the on-device playback UI.
This architectural posture is documented in the SleepCam Privacy Policy and is a load-bearing product claim, not just a configuration. It is described here for completeness — privacy as an architectural absence is not, in itself, a patentable invention, but its specific implementation choices (e.g., the deliberate non-use of face-detection APIs in an app whose primary capture surface is a face-pointed camera) are part of the public record as of this publication date.
The novel value of SleepCam, to the extent novelty is even claimed, lies in the specific combination of the foregoing techniques.
The author considers no individual element above to be novel in isolation — every one is built from public iOS APIs, well-documented signal-processing primitives, and standard data-structure operations. The combination as deployed is published here as prior art so that no party may subsequently obtain patent claims that read on it.
Citation. Huber, R. SleepCam Engineering Notes & Defensive Publication. May 8, 2026. https://sleepcamapp.com/engineering.
Comments or corrections: ryan@huber.llc.