SleepCam Engineering Notes

Defensive technical disclosure · Published May 8, 2026 · Author: Ryan Huber (Huber, LLC) · Reddit: u/Mission_Cheetah_5114

Defensive publication notice. The techniques described below are placed in the public domain as prior art. They are published, dated, and indexable so that any subsequent patent application by any party covering the same combinations of elements can be challenged on novelty (35 U.S.C. §102) and obviousness (§103) grounds. Readers are free to study, implement, and extend these techniques. Nothing here should be construed as patent licensing — but everything here is now part of the documented public record as of the publication date above.

Background

SleepCam is a native iOS application that captures a low-resolution photo of the user roughly once per minute through the night, optionally records audio, and presents the recording as a fast-scrubbing photo timelapse with an optional Apple Health sleep-stage overlay. All data stays on the user's device. There is no server, no account, no cloud sync, no analytics, and no third-party SDK. The app is built in Swift / SwiftUI for iOS 18 and later, on iPhone hardware that includes a TrueDepth front camera (introduced with iPhone X; iOS 18 itself requires iPhone XS / XR or later).

Designing an overnight, self-pointed, completely silent, dark-room camera that runs unattended on an off-the-shelf iPhone exposes a number of OS-level constraints that aren't obvious until you try. This document walks through the specific technical choices SleepCam makes, with enough detail to enable a competent iOS engineer to reproduce them.

1. Dark-room capture using the TrueDepth flood illuminator as a near-IR fill source

Problem

An iPhone resting on a nightstand pointed at a sleeping person has effectively no ambient light. The standard front camera's exposure system will either return a black image or push ISO so high that the image is uselessly noisy. The phone's flashlight and screen-as-flashlight tricks are not options because they would wake the sleeper.

Approach

iPhones with a TrueDepth front camera include two emitters that operate in the near-infrared band around 940 nm: a flood illuminator (a diffuse IR source used by Face ID to even out ambient lighting) and a dot projector (a structured-light source for depth). Of these, only the flood illuminator is useful here — it produces a uniform IR field rather than a structured pattern.

The TrueDepth front camera's image sensor includes an IR-cut filter, but the cut is imperfect and a measurable fraction of 940 nm light passes through and registers, particularly in the red channel. In practice, in a dark room with the flood illuminator active, the front camera produces a usable, low-noise grayscale-tinted image of a subject within roughly 1–1.5 m, more than sufficient for distinguishing sleep position, head orientation, and major body movement at the resolution SleepCam actually needs (≤ 1280 × 960).

The flood illuminator is not directly exposed as a configurable AVFoundation device. However, it is implicitly activated whenever the system uses the TrueDepth depth pipeline (e.g., a session configured with depth output) or whenever certain configurations of the TrueDepth front camera's auto-exposure compensate for low light. SleepCam configures an AVCaptureSession with the front builtInTrueDepthCamera device, attaches an AVCaptureVideoDataOutput, and uses session and device configurations that result in the flood illuminator being on during capture. The user is informed of this behavior and consents to it via standard iOS camera-permission prompts.

Reproduction notes
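
The session setup described above can be sketched roughly as follows. This is a minimal illustration, not SleepCam's actual configuration: the helper name makeTrueDepthSession, the 640 × 480 preset, and the choice to attach an AVCaptureDepthDataOutput are all assumptions. Whether the flood illuminator actually turns on for a given configuration has to be verified on a device in a dark room.

```swift
import AVFoundation

/// Hypothetical helper: builds a front TrueDepth session that streams
/// video frames to `delegate` on `queue`. Returns nil if the hardware or
/// the session cannot support the configuration.
func makeTrueDepthSession(
    delegate: AVCaptureVideoDataOutputSampleBufferDelegate,
    queue: DispatchQueue
) throws -> AVCaptureSession? {
    guard let device = AVCaptureDevice.default(.builtInTrueDepthCamera,
                                               for: .video,
                                               position: .front) else {
        return nil  // No TrueDepth camera on this device.
    }

    let session = AVCaptureSession()
    session.beginConfiguration()
    defer { session.commitConfiguration() }

    // Assumption: a low-resolution preset is enough for this use case.
    session.sessionPreset = .vga640x480

    let input = try AVCaptureDeviceInput(device: device)
    guard session.canAddInput(input) else { return nil }
    session.addInput(input)

    // Frames arrive through the sample-buffer delegate; late frames are dropped.
    let video = AVCaptureVideoDataOutput()
    video.alwaysDiscardsLateVideoFrames = true
    video.setSampleBufferDelegate(delegate, queue: queue)
    guard session.canAddOutput(video) else { return nil }
    session.addOutput(video)

    // Assumption: attaching a depth output is one way to engage the
    // TrueDepth depth pipeline mentioned in the text.
    let depth = AVCaptureDepthDataOutput()
    if session.canAddOutput(depth) { session.addOutput(depth) }

    return session
}
```

The caller would start the returned session with startRunning() off the main thread after camera permission has been granted.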

2. Silent capture by retrieving sample buffers from a video data output instead of using the still-photo capture pipeline

Problem

iOS plays a synthesized shutter sound whenever AVCapturePhotoOutput's capturePhoto(with:delegate:) is invoked: by regulatory requirement in some locales (notably Japan and Korea), and by default elsewhere. There is no public API to silence it across all locales. A click every 60 seconds in a dark bedroom would defeat the entire product premise.

Approach

SleepCam does not use AVCapturePhotoOutput at all. The capture session is configured with an AVCaptureVideoDataOutput; sample buffers stream continuously to the application via the AVCaptureVideoDataOutputSampleBufferDelegate on a dedicated dispatch queue. At each capture-interval tick, the application captures the next sample buffer's CVPixelBuffer from the running video stream, converts it to a JPEG (or HEIC) at the configured resolution, and writes it to the on-device session folder. The video output stream is not recorded as a movie — only single frames are persisted.

This bypasses the still-photo audio policy entirely. The capture is silent in every locale.

Reproduction notes

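// Sketch of the per-tick capture path (a method on the sample-buffer delegate).
// Requires: import AVFoundation, CoreImage, ImageIO.
// Assumed context: captureNowFlag is an atomic Bool (for example Atomic<Bool>
// from the Synchronization framework), ciContext is a long-lived CIContext.
func captureOutput(_ output: AVCaptureOutput,
                   didOutput sampleBuffer: CMSampleBuffer,
                   from connection: AVCaptureConnection) {
    // Atomically consume the "capture now" request; every other frame is dropped.
    guard captureNowFlag.exchange(false, ordering: .acquiringAndReleasing) else { return }
    guard let pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    // JPEG quality is passed under the ImageIO lossy-compression key.
    let quality = kCGImageDestinationLossyCompressionQuality as CIImageRepresentationOption
    let jpeg = ciContext.jpegRepresentation(
        of: ciImage,
        colorSpace: CGColorSpaceCreateDeviceRGB(),
        options: [quality: 0.7]
    )
    // Errors are ignored here; a failed write simply loses this frame.
    try? jpeg?.write(to: nextFrameURL(for: session))
    // Also write a downsampled thumbnail sidecar at this point; see §5.
}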

A timer (a foreground-app DispatchSourceTimer or a periodic scheduler keyed off wall-clock time, not CADisplayLink) sets captureNowFlag at the configured cadence. Between captures, sample buffers from the video output are simply discarded.
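
The timer side of this handshake can be sketched as below. It is an illustration under assumptions: CaptureFlag is a hypothetical lock-based stand-in for the atomic flag (its exchange takes no memory-ordering argument), and makeCaptureTimer, the utility queue, and the 50 ms leeway are invented for the example.

```swift
import Foundation

/// Hypothetical lock-based stand-in for the atomic `captureNowFlag`.
final class CaptureFlag: @unchecked Sendable {
    private let lock = NSLock()
    private var value = false

    /// Arms the flag so the next video frame is persisted.
    func set() {
        lock.lock(); defer { lock.unlock() }
        value = true
    }

    /// Stores `newValue` and returns the previous value in one locked step.
    func exchange(_ newValue: Bool) -> Bool {
        lock.lock(); defer { lock.unlock() }
        let old = value
        value = newValue
        return old
    }
}

/// Hypothetical helper: arms `flag` every `interval` seconds, scheduled
/// against wall-clock time. The caller must retain the returned timer.
func makeCaptureTimer(interval: TimeInterval, flag: CaptureFlag) -> DispatchSourceTimer {
    let timer = DispatchSource.makeTimerSource(queue: DispatchQueue.global(qos: .utility))
    timer.schedule(wallDeadline: .now() + interval,
                   repeating: interval,
                   leeway: .milliseconds(50))
    timer.setEventHandler { flag.set() }
    timer.resume()
    return timer
}
```

Because the delegate consumes the flag with exchange(false), at most one frame is written per tick even though many frames arrive between ticks.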

3. Foreground-only operation with screen dimmed below human visibility

Problem

iOS does not allow a third-party application to capture from the camera while in the background. Apps that have tried to extend background activity through location-permission keep-alive hacks have been rejected from the App Store. SleepCam must therefore run in the foreground throughout an overnight session, and the screen must be effectively invisible for the full 7+ hours, both for the sleeper's comfort and for battery preservation.

Approach

  1. The application declares no background modes (UIBackgroundModes) other than audio (because audio recording, if enabled, must not stop when the device locks the display).
  2. While a session is active, the app sets UIScreen.main.brightness to a very low value (the practical floor on most iPhones is around 0.0–0.01, which on OLED hardware renders effectively black). The original brightness is restored when the session ends or the app is foregrounded for review.
  3. The app sets UIApplication.shared.isIdleTimerDisabled = true so the device does not auto-lock and stop the foreground capture.
  4. The app fills the screen with a near-black UI so any pixels that are technically lit emit minimal light. On OLED displays, a true-black pixel emits no light at all, so the practical screen output during a session is dominated by the few pixels of any visible status indicators.
  5. The user is required to keep the device plugged in, and is told this in onboarding. Continuous capture for 7–8 hours with the display held on is power-intensive even with the screen dimmed.

The combination — accept the foreground constraint, dim the screen, declare only audio background mode, plug in — sidesteps both the OS limitation and the App Store review risk.
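
The save, dim, and restore bookkeeping from steps 2 and 3 can be sketched independently of UIKit. Everything named here is hypothetical: ScreenControlling is an invented protocol, and on a device a conforming type would forward to UIScreen.main.brightness and UIApplication.shared.isIdleTimerDisabled. FakeScreen exists only so the logic can be exercised without hardware.

```swift
import Foundation

/// Hypothetical abstraction over the two system settings named above.
protocol ScreenControlling: AnyObject {
    var brightness: Double { get set }
    var idleTimerDisabled: Bool { get set }
}

/// Dims the screen for a session and restores the user's brightness afterwards.
final class SessionScreenGuard {
    private let screen: ScreenControlling
    private var savedBrightness: Double?

    init(screen: ScreenControlling) { self.screen = screen }

    /// Saves the current brightness, dims, and disables auto-lock.
    func begin(dimTo level: Double = 0.0) {
        guard savedBrightness == nil else { return }  // Already in a session.
        savedBrightness = screen.brightness
        screen.brightness = level
        screen.idleTimerDisabled = true
    }

    /// Restores the saved brightness and re-enables auto-lock.
    func end() {
        guard let saved = savedBrightness else { return }
        screen.brightness = saved
        screen.idleTimerDisabled = false
        savedBrightness = nil
    }
}

/// In-memory stand-in for illustration only.
final class FakeScreen: ScreenControlling {
    var brightness = 0.6
    var idleTimerDisabled = false
}
```

Guarding begin() against re-entry matters: without it, a second call would save the already-dimmed value and the user's original brightness would be lost.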

4. Apple Health sleep-stage overlay aligned to the photo timeline, with a stage-vs-envelope dedupe rule

Problem

HealthKit's HKCategoryType for sleep analysis can return overlapping samples from different sources. In particular, an Apple Watch typically writes both a broad inBed envelope sample covering the whole night and finer asleepCore, asleepDeep, asleepREM, and awake samples for sub-intervals within that envelope. Naïvely rendering all returned samples produces a band where the broad envelope visually swamps the finer staged data.

Approach

SleepCam fetches all sleep samples in the session window via HKSampleQuery, then runs a two-pass dedupe before rendering:

  1. Pass 1: Keep all "staged" intervals as-is. Staged here means any of asleepCore, asleepDeep, asleepREM, awake.
  2. Pass 2: For each "envelope" interval (e.g., inBed), subtract the union of all staged intervals from it. The remaining sub-intervals — those parts of the envelope not covered by any richer staged sample — are kept as envelope-class intervals. Sub-intervals shorter than a minimum (60 seconds) are discarded as visually noisy.
  3. The combined set (all staged + trimmed envelope) is sorted chronologically and rendered as a horizontal band aligned with the photo-timeline scrubber. Each interval is colored by stage; envelope segments use a dimmer, less saturated color so the eye reads them as "background" relative to staged data.

Reproduction notes

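// Sketch of the dedupe. Stage and Interval are illustrative types.
import Foundation

enum Stage {
    case inBed, asleepUnspecified, awake, asleepCore, asleepDeep, asleepREM
    // "Staged" as defined in Pass 1. Treating asleepUnspecified as an
    // envelope-class value is an assumption of this sketch.
    var isStaged: Bool {
        switch self {
        case .asleepCore, .asleepDeep, .asleepREM, .awake: return true
        case .inBed, .asleepUnspecified: return false
        }
    }
}

struct Interval { let stage: Stage; let start: Date; let end: Date }

// Returns the parts of `envelope` not covered by any staged interval,
// dropping leftover pieces shorter than 60 seconds.
func subtract(staged: [Interval], from envelope: Interval) -> [Interval] {
    var pieces: [(Date, Date)] = [(envelope.start, envelope.end)]
    for s in staged where s.end > envelope.start && s.start < envelope.end {
        var next: [(Date, Date)] = []
        for (pStart, pEnd) in pieces {
            // No overlap with this piece: keep it unchanged.
            if s.end <= pStart || s.start >= pEnd {
                next.append((pStart, pEnd)); continue
            }
            // Keep whatever sticks out on either side of the staged sample.
            if s.start > pStart { next.append((pStart, s.start)) }
            if s.end   < pEnd   { next.append((s.end,   pEnd)) }
        }
        pieces = next
        if pieces.isEmpty { break }
    }
    return pieces
        .filter { $0.1.timeIntervalSince($0.0) >= 60 }
        .map { Interval(stage: envelope.stage, start: $0.0, end: $0.1) }
}

// Pass 1 keeps staged intervals; Pass 2 trims envelopes against them.
func dedupe(_ intervals: [Interval]) -> [Interval] {
    let staged = intervals.filter { $0.stage.isStaged }
    let envelopes = intervals.filter { !$0.stage.isStaged }
    let trimmedEnvelopes = envelopes.flatMap { subtract(staged: staged, from: $0) }
    return (staged + trimmedEnvelopes).sorted { $0.start < $1.start }
}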

HealthKit does not reliably report whether read authorization was granted (its authorizationStatus(for:) returns share/write status). SleepCam therefore treats an empty result set as "either no data or denied," and surfaces a soft hint pointing the user at the Health app's privacy settings rather than asserting authorization state.
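
For context, the fetch that feeds the dedupe might look like the sketch below. The function names fetchSleepSamples and sleepValue and the completion-handler shape are assumptions; the HealthKit calls themselves are the standard sample-query API. Read authorization is assumed to have been requested elsewhere.

```swift
import HealthKit

/// Hypothetical helper: fetches sleep-analysis samples that overlap the
/// session window, sorted by start date.
func fetchSleepSamples(
    store: HKHealthStore,
    start: Date,
    end: Date,
    completion: @escaping ([HKCategorySample]) -> Void
) {
    guard let sleepType = HKObjectType.categoryType(forIdentifier: .sleepAnalysis) else {
        completion([])
        return
    }
    // With no options, samples that merely overlap the window also match.
    let predicate = HKQuery.predicateForSamples(withStart: start, end: end, options: [])
    let byStart = NSSortDescriptor(key: HKSampleSortIdentifierStartDate, ascending: true)

    let query = HKSampleQuery(
        sampleType: sleepType,
        predicate: predicate,
        limit: HKObjectQueryNoLimit,
        sortDescriptors: [byStart]
    ) { _, samples, _ in
        // An empty array means "no data" or "read access denied";
        // the two cases cannot be told apart from here.
        completion((samples as? [HKCategorySample]) ?? [])
    }
    store.execute(query)
}

/// Maps a sample's raw integer value to the sleep-analysis enum.
func sleepValue(of sample: HKCategorySample) -> HKCategoryValueSleepAnalysis? {
    HKCategoryValueSleepAnalysis(rawValue: sample.value)
}
```

The completion handler runs on a background queue, so a caller would hop to the main queue before touching UI state.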

5. Playback engine for fast-scrub photo timelapse on older iPhones

Problem

A 7-hour session at 60-second cadence produces ~420 photos. At 1× playback speed (one photo per real second), this is fine. At 10× (10 photos per real second), naïve SwiftUI Image rendering produces visible stutter on older hardware (iPhone XS / iPhone 11 class) because SwiftUI re-evaluates view bodies and diffs the view tree at each frame change. The decode-and-display path also dominates the per-frame budget when full-resolution JPEGs are decoded for every step of the scrubber.

Approach

  1. Thumbnail sidecars at capture time. When SleepCam writes a captured JPEG, it also writes a thumbnail-sized version (≤ 256 px on the long edge) into the same session folder. The thumbnail is what the playback engine decodes during fast scrubbing; the full-resolution image is only loaded on a hold or zoom interaction.
  2. Sessions index. A small JSON file (sessions_index.json) at the top of the on-device sessions directory caches enough metadata about each session (id, start time, duration, frame count, cover thumbnail name) to render the home screen instantly without enumerating directories. The index is updated whenever a session ends or is deleted; it is rebuilt from the filesystem if missing or corrupt.
  3. UIImageView via UIViewRepresentable for the playback canvas. SwiftUI's Image diffs at the SwiftUI-tree level and is too slow at 10 Hz on older devices. A UIImageView wrapped in UIViewRepresentable, with the underlying image property assigned directly each tick, bypasses SwiftUI diffing.
  4. Decode via CGImageSourceCreateThumbnailAtIndex with kCGImageSourceShouldCacheImmediately. ImageIO's thumbnail path is faster than full-frame decoding and produces a CGImage already in the cached, render-ready state. This avoids a re-decode the first time the image is drawn.
  5. NSCache with prefetch window. The playback engine maintains an NSCache<NSNumber, UIImage> keyed by frame index. On each tick it prefetches the ±8 frames around the cursor on a background queue. NSCache automatically evicts under memory pressure.
  6. CADisplayLink ticker with skip-on-miss. The playback cursor advances based on wall-clock elapsed time, not frame count. If the next required frame is not yet decoded when its display moment arrives, the engine skips the decode and advances the cursor; the next frame either comes from the prefetch cache or is decoded for its own slot. The result is that playback maintains real-time cadence at the configured speed and gracefully drops frames under decode pressure rather than stalling. Pseudocode:
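// Sketch; these members are assumed to live on the playback controller
// (requires import UIKit). frameCount is the number of frames in the session.
let displayLink = CADisplayLink(target: self, selector: #selector(tick))
// preferredFrameRateRange (iOS 15+) supersedes preferredFramesPerSecond.
displayLink.preferredFrameRateRange = CAFrameRateRange(minimum: 30, maximum: 30, preferred: 30)
displayLink.add(to: .main, forMode: .common)

@objc func tick() {
    guard frameCount > 0 else { return }
    let now = CACurrentMediaTime()
    let elapsed = now - playbackStartedAt
    let virtualSeconds = elapsed * playbackSpeed   // 1×, 5×, 10×
    // captureRateHz: frames advanced per virtual second.
    // Clamp so the cursor never runs past the last captured frame.
    let targetIndex = min(Int(virtualSeconds * captureRateHz), frameCount - 1)

    if let image = cache.object(forKey: NSNumber(value: targetIndex)) {
        imageView.image = image
        currentIndex = targetIndex
    } else {
        // Skip-on-miss: do nothing this tick. Prefetcher will deliver soon.
        // (Optionally, on a long stall, fall back to a decoded thumbnail
        // synchronously to avoid a frozen viewport.)
    }

    schedulePrefetch(around: targetIndex, window: 8)
}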
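
The index arithmetic behind the ticker and the ±8 prefetch window can be pulled out into pure functions, which makes it easy to check in isolation. This is a sketch under assumptions: targetFrameIndex and prefetchIndices are hypothetical names, framesPerSecondAt1x defaults to 1.0 to match the "one photo per real second at 1×" convention stated in the Problem paragraph, and the nearest-first ordering with forward bias is a design choice of the example rather than something the text specifies.

```swift
import Foundation

/// Frame to display after `elapsed` real seconds of playback, clamped to
/// the last frame. Returns nil when there is nothing to show.
func targetFrameIndex(elapsed: TimeInterval,
                      speed: Double,
                      framesPerSecondAt1x: Double = 1.0,
                      frameCount: Int) -> Int? {
    guard frameCount > 0, elapsed >= 0 else { return nil }
    let raw = Int(elapsed * speed * framesPerSecondAt1x)
    return min(raw, frameCount - 1)
}

/// Frame indices within `window` of `cursor`, clamped to the valid range,
/// ordered nearest-first with the forward neighbour before the backward one.
func prefetchIndices(around cursor: Int, window: Int, frameCount: Int) -> [Int] {
    guard frameCount > 0, window >= 0 else { return [] }
    var result: [Int] = []
    for offset in 0...window {
        let ahead = cursor + offset
        if ahead >= 0 && ahead < frameCount { result.append(ahead) }
        let behind = cursor - offset
        if offset > 0 && behind >= 0 && behind < frameCount { result.append(behind) }
    }
    return result
}
```

Ordering the list nearest-first means that if the background queue falls behind, the frames most likely to be needed next are decoded before the ones further away.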

6. Audio: continuous recording with volume-only loud-region highlighting

Problem

The user wants to know when something audible happened during the night — a snore, a cough, a partner moving — without the app making any classification claims it can't back up. ML-based snore detection is a meaningfully different product surface and a regulatory minefield.

Approach

If the user enables audio, SleepCam records the entire session as a single AAC file via AVAudioRecorder in a background-capable audio session (.playAndRecord with .mixWithOthers off, .duckOthers off, .allowAirPlay off; route override left to system).

For visualization, SleepCam computes a per-bin RMS amplitude over the recorded file (or a streaming buffer during live recording) at a fixed bin resolution (e.g., 1 bin per 5–10 seconds). Bins exceeding a threshold relative to the session's median amplitude are flagged as "loud regions." The waveform UI renders these regions in red over the otherwise-neutral waveform color. There is no audio classification — no snore/cough/talk detection, no model inference, no labels beyond "loud" and "not loud."
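
The loud-region rule reduces to two small steps: compute RMS per bin, then compare each bin to a multiple of the median. The sketch below assumes the audio has already been decoded to an array of Float samples; the function names and the factor parameter are hypothetical, since the text only says the threshold is relative to the session's median amplitude.

```swift
import Foundation

/// Root-mean-square amplitude of each consecutive bin of `binSize` samples.
/// The final bin may be shorter than `binSize`.
func rmsBins(samples: [Float], binSize: Int) -> [Float] {
    guard binSize > 0 else { return [] }
    return stride(from: 0, to: samples.count, by: binSize).map { start in
        let bin = samples[start..<min(start + binSize, samples.count)]
        let meanSquare = bin.reduce(Float(0)) { $0 + $1 * $1 } / Float(bin.count)
        return meanSquare.squareRoot()
    }
}

/// Indices of bins whose RMS exceeds `factor` times the median RMS.
func loudBins(rms: [Float], factor: Float) -> [Int] {
    guard !rms.isEmpty else { return [] }
    let sorted = rms.sorted()
    let mid = sorted.count / 2
    let median = sorted.count % 2 == 0
        ? (sorted[mid - 1] + sorted[mid]) / 2
        : sorted[mid]
    return rms.indices.filter { rms[$0] > factor * median }
}
```

At a 44.1 kHz sample rate, a 5-second bin would correspond to a binSize of 220,500 samples. Note that if the median is zero (a digitally silent recording), any non-zero bin is flagged, so a real implementation would likely want a noise floor as well.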

7. Privacy as architecture

SleepCam captures imagery of a sleeping person, often including the user's face. The application contains no networking code beyond standard URL-handling and the user-initiated share-sheet path. There is no analytics SDK, no advertising SDK, no crash-reporting SDK, no third-party SDK of any kind. Photos and audio are written into the application's private container and are never transmitted by SleepCam itself; they are read back only for on-device playback.

The app does not perform face detection, does not use ARKit face tracking or Vision face APIs, and does not extract any biometric data. No face template, face mesh, face print, or facial-landmark data is computed or stored. The captured 2D images are written to disk and read back only for display in the on-device playback UI.

This architectural posture is documented in the SleepCam Privacy Policy and is a load-bearing product claim, not just a configuration. It is described here for completeness — privacy as an architectural absence is not, in itself, a patentable invention, but its specific implementation choices (e.g., the deliberate non-use of face-detection APIs in an app whose primary capture surface is a face-pointed camera) are part of the public record as of this publication date.

Combinations to be clear about

The novel value of SleepCam, to the extent novelty is even claimed, lies in the specific combination of the foregoing:

The author considers no individual element above to be novel in isolation — every one is built from public iOS APIs, well-documented signal-processing primitives, and standard data-structure operations. The combination as deployed is published here as prior art so that no party may subsequently obtain patent claims that read on it.


Citation. Huber, R. SleepCam Engineering Notes & Defensive Publication. May 8, 2026. https://sleepcamapp.com/engineering.

Comments or corrections: ryan@huber.llc.