Vibe coded app that's playing around with GDPR data from BeReal

bereal-archive

A local, private web viewer for your BeReal data export. Drop it next to the ZIP you got from BeReal Support and you get a real app: a dashboard with stats about your posting habits, a BeReal-style photo gallery, a map of every geotagged moment, and — if you want it — a people page that automatically groups every photo by the faces inside it.

Everything runs on your computer. No data ever leaves your machine.

Why you might want this

You have three years of BeReals locked inside posts.json and a sea of .webp files. BeReal's official viewer is the app itself, and the app shows you basically nothing about your archive as a whole. This little tool turns the export into something you can actually browse, search, and reminisce over.

A few things it can tell you:

  • How many BeReals you've taken, what fraction were late, and the median delay between the daily moment and when you actually posted
  • A heatmap of which hours of the day and which weekdays you tend to post
  • Which 8 places you've BeRealed from the most
  • A timeline of every face that appears more than ~4 times across your archive, grouped by person (you can label them: "Mom", "Alex", etc.)
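
The first two figures in the list above are plain arithmetic over timestamps. Here is a minimal sketch, assuming each post has already been reduced to a dict with a `moment` datetime (the daily notification), a `posted` datetime, and a `late` flag. Those key names are hypothetical and are not taken from posts.json.

```python
from collections import Counter
from statistics import median


def posting_stats(posts):
    """Summarise posts given as dicts with hypothetical keys
    'moment' (datetime), 'posted' (datetime) and 'late' (bool)."""
    total = len(posts)
    if total == 0:
        return {"total": 0, "late_fraction": 0.0,
                "median_delay_s": None, "heatmap": Counter()}

    # Delay in seconds between the daily moment and the actual post.
    delays = [(p["posted"] - p["moment"]).total_seconds() for p in posts]
    late = sum(1 for p in posts if p["late"])

    # (weekday, hour) -> number of posts; weekday 0 is Monday.
    heatmap = Counter((p["posted"].weekday(), p["posted"].hour) for p in posts)

    return {
        "total": total,
        "late_fraction": late / total,
        "median_delay_s": median(delays),
        "heatmap": heatmap,
    }
```

The median is used rather than the mean so that a handful of very late posts does not dominate the figure.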

What you need

  • PHP 8.0+ with the GD extension (most macOS/Linux installs have this by default). On macOS: brew install php. On Ubuntu: sudo apt install php-cli php-gd.
  • Python 3.9+ with pip
  • About 300 MB of free disk if you want the face-recognition feature (the model downloads on first run)

Get started in five minutes

1. Get your BeReal export

Go to BeReal Settings → Privacy and Permissions → Request your data. A few hours later you'll get an email with a ZIP. Unzip it somewhere. You should see files like user.json, posts.json, memories.json, and a Photos/ folder full of .webp images.

2. Drop this folder inside

Copy the whole bereal-archive/ folder anywhere inside the unzipped export. A typical layout looks like this:

my-bereal-export/
├── user.json
├── posts.json
├── memories.json
├── friends.json
├── Photos/
│   ├── post/
│   ├── profile/
│   └── realmoji/
└── bereal-archive/        ← this folder, anywhere inside
    ├── run.sh
    ├── analyze.py
    └── ...

The scripts find the export root automatically by walking up the filesystem looking for a folder that contains both Photos/ and user.json.
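
The lookup described above can be sketched as follows. This is an illustration, not the project's actual code: the function name `find_export_root` and the `max_levels` parameter are assumptions, with the default of 5 borrowed from the "up to 5 directory levels above" limit mentioned in the hosting section.

```python
from pathlib import Path


def find_export_root(start, max_levels=5):
    """Walk upwards from `start` until a folder holding both Photos/ and
    user.json is found. Returns that folder, or None if nothing matches
    within `max_levels` parent hops (hypothetical limit)."""
    current = Path(start).resolve()
    for _ in range(max_levels + 1):
        if (current / "Photos").is_dir() and (current / "user.json").is_file():
            return current
        if current.parent == current:  # reached the filesystem root
            break
        current = current.parent
    return None
```

Called as `find_export_root(Path(__file__).parent)`, it would return the export folder for the layout shown above, wherever bereal-archive/ sits inside it.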

3. Start the viewer

cd bereal-archive
./run.sh

Open http://127.0.0.1:8123 in your browser. You should see your dashboard. The gallery, map, friends, and comments pages all work right away — no Python needed.

4. (Optional) Dark-image filter + face overlays

If you want the gallery to skip near-black photos and show how many faces are in each photo, install the lightweight Python deps and run the analyzer once:

pip install opencv-python-headless numpy
python3 analyze.py

This walks every .webp, measures brightness, runs a fast Haar-cascade face detector, and writes cache.json to your export root. It's resumable — ctrl-C and rerun anytime.

On an M1 MacBook Pro, ~2 500 photos take about 3 minutes.
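
The resumable behaviour can be sketched with the standard library alone. The example below is an assumption about how such a cache might work, not a copy of analyze.py: the `measure` callback stands in for the brightness and Haar-cascade steps, and the cache layout (relative path mapped to a result dict) is hypothetical.

```python
import json
import os
from pathlib import Path


def load_cache(cache_path):
    """Return the cached results, or an empty dict on first run."""
    try:
        with open(cache_path, encoding="utf-8") as fh:
            return json.load(fh)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_cache(cache_path, cache):
    """Write to a temp file first so an interrupted run never leaves
    a half-written cache behind."""
    tmp = f"{cache_path}.tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(cache, fh)
    os.replace(tmp, cache_path)


def analyze_all(photo_dir, cache_path, measure, save_every=50):
    """Run `measure(path)` on every .webp not already in the cache."""
    root = Path(photo_dir)
    cache = load_cache(cache_path)
    done = 0
    try:
        for path in sorted(root.rglob("*.webp")):
            key = path.relative_to(root).as_posix()
            if key in cache:
                continue  # already analysed in an earlier run
            cache[key] = measure(path)
            done += 1
            if done % save_every == 0:
                save_cache(cache_path, cache)
    finally:
        # Also runs on Ctrl-C, so finished work is kept.
        save_cache(cache_path, cache)
    return cache
```

Because finished files are skipped by key, rerunning after an interruption only pays for the photos that were not processed yet.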

5. (Optional, the fun one) Group photos by person

If you want the People page — where every face in your archive is clustered automatically — install the heavier Python deps and run the clustering pipeline:

pip install insightface onnxruntime scikit-learn
python3 cluster_faces.py

On first run it downloads the InsightFace buffalo_l model (~280 MB). Then it computes a 512-dimensional embedding for every detected face and clusters them with DBSCAN under cosine distance. About 20 minutes for 2 500 photos on a laptop.
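
To make "DBSCAN under cosine distance" concrete, here is a small dependency-free sketch. The script itself presumably relies on scikit-learn (it is in the install line above), so treat this as an illustration of the idea rather than the code that runs; the function names are hypothetical, and embeddings are assumed to be non-zero vectors.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dbscan(points, eps, min_samples):
    """Return one label per point; -1 marks noise.
    A point counts itself among its neighbours."""
    n = len(points)
    neighbours = [
        [j for j in range(n) if cosine_distance(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n  # None = not visited yet
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])  # core point: keep expanding
    return labels
```

Here `eps` and `min_samples` play the same roles as the --eps and --min-samples flags: a lower `eps` demands more similar faces, and a higher `min_samples` demands more appearances before a person gets a cluster.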

Output: faces.json (cluster summaries) and faces_raw.npz (embeddings, resumable cache).

Now the People page in the web UI works. Click a face, see every photo of that person, type a name to label them.

If the result doesn't look right, the script ships three presets and a pile of tunable knobs (see "Tuning face recognition" below). --cluster-only reuses the cached embeddings, so re-clustering is near-instant — tweak until you like the result.

Tuning face recognition

The face-clustering script supports a precision-favoured pipeline you can turn on with one flag:

python3 cluster_faces.py --preset precise

What --preset precise actually does:

  1. Quality filtering. Each detected face is scored 0..1 from detection confidence, relative size, and frontality (how symmetric the eyes/nose/mouth landmarks are). Faces below a threshold are dropped from clustering — backgrounds, half-faces and motion blur stop polluting the result.
  2. Test-time augmentation (TTA). Each face is embedded from both the original and a horizontally flipped image; the two embeddings are averaged. ~2× embedding time, meaningfully more robust.
  3. Chinese Whispers clustering. Instead of DBSCAN, the script builds a similarity graph and propagates labels. This is a widely used approach for face clustering and tends to handle clusters of very different sizes better than density-based methods.
  4. Merge + reassign. After the primary clustering, any pair of clusters whose centroids are very close gets merged (catches "same person split in two"), and any noise face that's clearly close to an existing cluster gets attached to it (catches "single appearance with high-confidence match").
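
Step 3 above is the least familiar of the four, so here is a minimal sketch of Chinese Whispers label propagation. It assumes a precomputed square similarity matrix and is not taken from cluster_faces.py; the function name, the `iterations` cap, and the fixed `seed` are illustrative choices.

```python
import random


def chinese_whispers(similarity, threshold, iterations=20, seed=0):
    """Cluster nodes of a similarity graph by label propagation.
    `similarity` is a square list-of-lists; an edge exists between i and j
    when similarity[i][j] >= threshold. Returns one label per node."""
    n = len(similarity)
    rng = random.Random(seed)
    edges = {
        i: [(j, similarity[i][j]) for j in range(n)
            if j != i and similarity[i][j] >= threshold]
        for i in range(n)
    }
    labels = list(range(n))  # every node starts in its own cluster
    for _ in range(iterations):
        order = list(range(n))
        rng.shuffle(order)
        changed = False
        for i in order:
            if not edges[i]:
                continue  # isolated node keeps its own label
            # Sum edge weights per neighbouring label.
            weights = {}
            for j, w in edges[i]:
                weights[labels[j]] = weights.get(labels[j], 0.0) + w
            # Strongest label wins; ties favour the current label for stability.
            best = max(weights, key=lambda lab: (weights[lab], lab == labels[i]))
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels
```

The `threshold` argument corresponds to --cw-threshold: raising it removes weaker edges from the graph, which yields smaller, stricter clusters. Nodes that end up alone can be treated as noise.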

If you don't want all of it, the pieces are individually toggleable:

# Faster: skip TTA, just try Chinese Whispers + refinement on existing embeddings
python3 cluster_faces.py --cluster-only \
    --algo chinese-whispers --merge 0.62 --reassign 0.55

# Stricter clusters at the cost of recall
python3 cluster_faces.py --cluster-only --min-quality 0.55 \
    --algo chinese-whispers --cw-threshold 0.6

# More lenient — captures more identities, accepts occasional mixing
python3 cluster_faces.py --cluster-only --preset lenient

What each knob means:

| Flag | Effect | Sensible range |
|---|---|---|
| --algo dbscan / --algo chinese-whispers | Clustering algorithm | |
| --eps | DBSCAN cosine-distance epsilon. Lower = stricter | 0.35–0.55 |
| --cw-threshold | Chinese Whispers similarity edge cutoff. Higher = stricter | 0.45–0.65 |
| --min-samples | Minimum appearances to form a cluster | 2–5 |
| --min-quality | Drop faces below this quality (0..1) | 0.3–0.6 |
| --tta / --no-tta | Enable / disable flip-augmented embeddings | |
| --merge X | After clustering, merge clusters whose centroid sim > X | 0.6–0.7 |
| --reassign X | After clustering, attach noise faces whose nearest-cluster sim > X | 0.5–0.6 |

Quality filtering and TTA both require re-embedding (i.e. dropping --cluster-only), because they change what's stored in faces_raw.npz. The script auto-detects this for TTA and re-embeds for you when you switch it on. The clustering knobs (--algo, --eps, --cw-threshold, --merge, --reassign) can be tuned freely with --cluster-only, no re-embedding needed.
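
The --merge and --reassign refinements only need embeddings and labels, which is why they work with --cluster-only. The sketch below shows one way such a pass could look. It is an assumption about the approach, not the script's code: the function names are hypothetical, and it assumes non-zero embeddings whose cluster means are also non-zero.

```python
import math


def _unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _cos(a, b):
    return sum(x * y for x, y in zip(a, b))  # inputs are unit vectors


def _centroids(embeddings, labels):
    out = {}
    for c in sorted(set(labels) - {-1}):
        members = [e for e, l in zip(embeddings, labels) if l == c]
        mean = [sum(col) / len(members) for col in zip(*members)]
        out[c] = _unit(mean)
    return out


def merge_and_reassign(embeddings, labels, merge_sim, reassign_sim):
    """Merge clusters whose centroids are closer than merge_sim, then attach
    noise points (label -1) whose best centroid match exceeds reassign_sim."""
    embeddings = [_unit(e) for e in embeddings]
    labels = list(labels)

    # 1) Merge: union clusters with very similar centroids.
    cents = _centroids(embeddings, labels)
    parent = {c: c for c in cents}

    def find(c):
        while parent[c] != c:
            c = parent[c]
        return c

    for a in cents:
        for b in cents:
            if a < b and _cos(cents[a], cents[b]) > merge_sim:
                parent[find(b)] = find(a)
    labels = [find(l) if l != -1 else -1 for l in labels]

    # 2) Reassign: give noise points to a clearly matching cluster.
    cents = _centroids(embeddings, labels)
    for i, l in enumerate(labels):
        if l == -1 and cents:
            best = max(cents, key=lambda c: _cos(cents[c], embeddings[i]))
            if _cos(cents[best], embeddings[i]) > reassign_sim:
                labels[i] = best
    return labels
```

Merging runs first so that reassignment compares noise faces against the combined, more representative centroids.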

What you'll see

| Page | What it does |
|---|---|
| Dashboard | Totals (posts, active days, on-time vs. late), median post delay, monthly bar chart, hour/weekday heatmap, retake distribution, top 8 locations |
| Gallery | Every post in BeReal layout (back camera big, selfie inset). Filters: skip dark, dark-only, with-faces, by year. Right-click any photo for a context menu: open back/selfie, swap which is the main image, copy URLs |
| People | One tile per identity, labeled with the size of the cluster. Click to see every photo of that person; type a name to label them |
| All faces | Every photo containing a face, with bounding-box overlays |
| Map | Every geotagged post on a Leaflet map with cluster markers. Click a marker for a thumbnail |
| Friends | Your friends in the order you added them, plus a count of how often you @-mentioned each one in comments |
| Comments | Search and stats over the comments you authored |

Hosting it somewhere other than ./run.sh

./run.sh is the easy path: it boots PHP's dev server at the host root and points the docroot at this folder. If you'd rather drop it into an existing web server (Apache + a bereal/ subdirectory of your public_html/, an Nginx reverse proxy at /viewer/, shared hosting, whatever), it just works — every in-app URL is built from $_SERVER['SCRIPT_NAME'] at request time, so the app discovers its own prefix instead of assuming it lives at /.

There's nothing to configure. Place the folder, make sure the JSON files + Photos/ are reachable (anywhere up to 5 directory levels above), and hit the URL.
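
The prefix discovery is easy to picture in a few lines. The app does this in PHP from $_SERVER['SCRIPT_NAME']; the sketch below expresses the same idea in Python to match the other examples here, and the helper names `base_prefix` and `url_for`, as well as the `id` query parameter in the usage note, are hypothetical.

```python
import posixpath
from urllib.parse import urlencode


def base_prefix(script_name):
    """Directory part of the running script's URL path, with a trailing slash.
    '/bereal/gallery.php' gives '/bereal/'; '/index.php' gives '/'."""
    directory = posixpath.dirname(script_name)
    return directory.rstrip("/") + "/"


def url_for(script_name, page, **params):
    """Build an in-app URL relative to wherever the app is mounted."""
    url = base_prefix(script_name) + page
    return f"{url}?{urlencode(params)}" if params else url
```

With this approach, `url_for("/viewer/index.php", "img.php", id="42")` yields `/viewer/img.php?id=42`, while the same call under the dev server (`/index.php`) yields `/img.php?id=42`, with no configured base path in either case.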

Privacy, plainly

  • The web server is bound to 127.0.0.1, not your network. Only your own machine can reach it.
  • No analytics, no telemetry, no third-party requests at all — except:
    • Map tiles from OpenStreetMap (only when you open /map.php)
    • Leaflet JS/CSS from UNPKG (also only on the map page)
    • The InsightFace model download from GitHub Releases (only the first time you run cluster_faces.py)
  • Everything else is local-only: image analysis, face detection, clustering, the web UI. None of your photos, comments, friends, or locations are sent anywhere.
  • The generated cache files (cache.json, faces.json, faces_raw.npz, people_labels.json) live in the export root, next to your data, not inside this folder. So when you share this folder publicly, none of your personal data tags along.
  • The included .gitignore blocks every BeReal export file and every generated cache from ever being committed by mistake.

Common questions

Q: Do I have to run the Python parts? No. The dashboard, gallery, map, friends, and comments pages work with the JSON files alone. Skipping analyze.py just means you don't get the dark-frame filter or face counts. Skipping cluster_faces.py just means the People page stays empty.

Q: Why doesn't a photo open when I click it? The PHP server has to be running (./run.sh in this folder). If the page loads but images are broken, you probably launched PHP from the wrong folder — run it from inside bereal-archive/ so the docroot is correct.

Q: The face clustering grouped me correctly but split one friend across three separate clusters. What went wrong? The default settings are tuned for precision over recall. The fastest fix is to re-cluster (no re-embedding needed) with the merge step enabled, which combines clusters whose centroids are close:

python3 cluster_faces.py --cluster-only \
    --algo chinese-whispers --merge 0.62 --reassign 0.55

If that's not enough, run the full precision pipeline (re-embeds with TTA, takes a while):

python3 cluster_faces.py --preset precise

See "Tuning face recognition" above for the full set of knobs.

Q: It says "BeReal export not found". Either you ran PHP from a folder that doesn't have a BeReal export above it, or your export is missing Photos/ or user.json. Make sure this folder is somewhere inside the unzipped BeReal export.

Q: My export uses different timezones. What's shown in the dashboard? The timezone declared in your user.json is used everywhere times are shown. It falls back to UTC if the field is missing or invalid.
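
A timezone lookup with a UTC fallback can be sketched like this. The field name `timezone` is an assumption about user.json, and the app itself handles this in PHP; the example only illustrates the fallback rule.

```python
from datetime import timezone
from zoneinfo import ZoneInfo


def resolve_timezone(user, field="timezone"):
    """Return the tzinfo named in the user record, or UTC when the field
    (hypothetical name) is missing, not a string, or not a known zone."""
    name = user.get(field)
    if not isinstance(name, str) or not name:
        return timezone.utc
    try:
        return ZoneInfo(name)
    except (KeyError, ValueError, OSError):
        # ZoneInfoNotFoundError is a KeyError; malformed keys raise ValueError.
        return timezone.utc
```

A stored UTC timestamp would then be displayed with `stamp.astimezone(resolve_timezone(user))`.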

Q: Can I move the bereal-archive folder somewhere else and point it at the export? Not built in. Today it expects to live somewhere inside the export folder. Patches welcome.

Troubleshooting

| Symptom | Likely cause and fix |
|---|---|
| error: 'php' is not installed | Install PHP first: brew install php (macOS) or sudo apt install php-cli php-gd |
| Pages load but images are broken | PHP started from the wrong folder. Run ./run.sh from inside bereal-archive/ |
| BeReal export not found notice on every page | bereal-archive/ isn't inside the unzipped export, or the export is incomplete |
| analyze.py fails with "Could not load OpenCV Haar cascades" | pip install opencv-python-headless (not the bare opencv-python) |
| cluster_faces.py fails to download the model | Check your internet connection on first run only; the model is cached after the first download |
| Port 8123 already in use | Run ./run.sh 9000 (or any free port) |

License

MIT. Use it, fork it, share it. If you ship a fork, please keep the privacy posture intact — no telemetry, no remote logging.