Examples

Find duplicate uploads

php
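// Build an in-memory index from the images that are already stored.
$index = new ImageHashIndex();

foreach ($existingImages as $image) {
    $index->add($image->id, $image->hash);
}

// Look for stored hashes within distance 3 of the new upload's hash.
$matches = $index->search($newUploadHash, maxDistance: 3, limit: 10);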

if (count($matches) > 0) {
    echo "This upload may already exist.";
}

Search screenshots

php
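// Load an existing index file.
$index = ImageHashIndex::load('/data/screenshots.idx');

// A looser threshold and a larger limit return more candidates to inspect.
$matches = $index->search(
    $targetHash,
    maxDistance: 8,
    limit: 50
);

// Each match carries the stored id and its distance from the target hash.
foreach ($matches as $match) {
    echo "Possible match: {$match['id']} distance {$match['distance']}\n";
}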

Image moderation

php
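// Load the index of hashes for images that were previously banned.
$bannedIndex = ImageHashIndex::load('/data/banned.idx');

$matches = $bannedIndex->search(
    $uploadedImageHash,
    maxDistance: 5,
    limit: 10
);

// Any hit is only a candidate, so send it to a moderator rather than blocking outright.
if (count($matches) > 0) {
    echo "Upload requires moderator review.";
}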

Useful for:

  • detecting previously removed images
  • identifying slightly modified reposts
  • forum and community moderation
  • reducing manual review workload

What this means

Even if a user crops, resizes or slightly edits an image, perceptual hashing can often identify that it is visually similar to content that has already been reviewed or removed.
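To make "visually similar" concrete: perceptual-hash tools commonly compare two hashes by counting the bits in which they differ (the Hamming distance). The sketch below assumes that is what `distance` means here and that hashes are equal-length hex strings; neither is confirmed by this page, and `hammingDistance` is a hypothetical helper, not part of `ImageHashIndex`.

```php
<?php
declare(strict_types=1);

/**
 * Hamming distance between two equal-length hex-encoded hashes.
 * Assumption: hashes are hex strings (for example 16 hex digits for 64 bits).
 * The library's real hash format may differ.
 */
function hammingDistance(string $hexA, string $hexB): int
{
    if (strlen($hexA) !== strlen($hexB)) {
        throw new InvalidArgumentException('Hashes must have the same length.');
    }

    $distance = 0;
    // Compare one hex digit (4 bits) at a time so large hashes never overflow an int.
    for ($i = 0, $n = strlen($hexA); $i < $n; $i++) {
        $xor = hexdec($hexA[$i]) ^ hexdec($hexB[$i]);
        // Count the set bits in the 4-bit XOR result.
        $distance += substr_count(decbin($xor), '1');
    }

    return $distance;
}

// The two sample hashes differ only in their last hex digit (8 vs b), i.e. in 2 bits.
echo hammingDistance('ffd8a1c3007e3c18', 'ffd8a1c3007e3c1b'), "\n"; // prints 2
```

A small edit to an image tends to flip only a few bits of its hash, which is why a low distance suggests the same underlying picture.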

Choosing maxDistance

Start strict, then loosen: begin with a low maxDistance and raise it only if known duplicates are being missed.

text
0-2     near exact
3-5     very similar
6-10    similar enough to review
11+     expect noise
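The bands above can be turned into labels for review queues or logs. This is a minimal sketch: `distanceBand` is a hypothetical helper, not part of `ImageHashIndex`, and it treats a distance of exactly 10 as still worth reviewing. The sample matches are made up and only mirror the `id` and `distance` keys used in the earlier examples.

```php
<?php
declare(strict_types=1);

/**
 * Map a match distance to one of the rough bands from the table.
 * Hypothetical helper; the cut-offs are starting points, not fixed rules.
 */
function distanceBand(int $distance): string
{
    if ($distance < 0) {
        throw new InvalidArgumentException('Distance cannot be negative.');
    }

    return match (true) {
        $distance <= 2  => 'near exact',
        $distance <= 5  => 'very similar',
        $distance <= 10 => 'similar enough to review',
        default         => 'expect noise',
    };
}

// Made-up matches, shaped like the search results shown earlier.
$matches = [
    ['id' => 101, 'distance' => 1],
    ['id' => 102, 'distance' => 7],
    ['id' => 103, 'distance' => 14],
];

foreach ($matches as $match) {
    echo "{$match['id']}: " . distanceBand($match['distance']) . "\n";
}
```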

The best threshold depends on your images. Screenshot datasets may behave differently to photos.
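One way to pick a threshold for your own data is to measure distances for pairs you have already labelled, then see what each candidate maxDistance would catch. The sketch below is illustrative only: `evaluateThresholds` is a hypothetical helper, the sample distances are invented, and it assumes a match counts when its distance is less than or equal to maxDistance.

```php
<?php
declare(strict_types=1);

/**
 * For each candidate threshold, count how many known duplicate pairs would be
 * found and how many known unrelated pairs would be wrongly flagged.
 *
 * @param int[] $duplicateDistances distances measured between known duplicates
 * @param int[] $unrelatedDistances distances measured between unrelated images
 * @param int[] $thresholds         candidate maxDistance values
 * @return array<int, array{found: int, falseAlarms: int}>
 */
function evaluateThresholds(array $duplicateDistances, array $unrelatedDistances, array $thresholds): array
{
    $report = [];
    foreach ($thresholds as $t) {
        $report[$t] = [
            // Assumption: a pair matches when its distance is <= the threshold.
            'found' => count(array_filter($duplicateDistances, fn (int $d): bool => $d <= $t)),
            'falseAlarms' => count(array_filter($unrelatedDistances, fn (int $d): bool => $d <= $t)),
        ];
    }

    return $report;
}

// Invented sample distances, for illustration only.
$report = evaluateThresholds([0, 2, 4, 9], [7, 12, 20, 31], [3, 5, 8, 10]);

foreach ($report as $threshold => $row) {
    echo "maxDistance {$threshold}: found {$row['found']}, false alarms {$row['falseAlarms']}\n";
}
```

Running the same comparison separately for screenshots and for photos shows whether the two need different thresholds.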
