RC RANDOM CHAOS

Identifying a Saudi desert fossil with PCA on a shell shape dataset

· via Hacker News

Original source

I found a seashell in the middle of the desert


A hobbyist found a seashell-shaped rock at the base of a cliff in Saudi Arabia’s Alghat desert, 500 km from the nearest coast. The region’s carbonate rocks and marine fossils date back to the Late Jurassic, when parts of the Arabian Peninsula were submerged, so the find is geologically plausible even if visually startling. Lacking access to a paleontologist, the author decided to identify the fossil purely by morphology, using a public dataset of 7,894 shell species and nearly 60,000 images.

The pipeline normalized each shell for position, scale, and orientation, extracted a 256-point contour, and treated the result as a vector in 256-dimensional space, with squared Euclidean distance as the measure of dissimilarity between shells. Principal component analysis (PCA) reduced these vectors to two components that together capture 67% of the variance: PC1 corresponds to pointiness, PC2 to vertical symmetry. Plotting the dataset in these two dimensions showed that round shells cluster tightly and symmetrically, while pointy shells occupy a broader, rougher region of the latent space.
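The normalization and projection steps can be sketched roughly as follows. This is a minimal illustration on synthetic outlines, not the author's code: the function names `normalize_contour` and `pca` are hypothetical, and the choice to encode each shell as the 256 distances of its contour points from the centroid is an assumption, since the summary does not say how the contour becomes a 256-value vector.

```python
import numpy as np

def normalize_contour(pts):
    """Remove position, scale, and orientation from an (n, 2) contour."""
    pts = np.asarray(pts, dtype=float)
    centered = pts - pts.mean(axis=0)  # position: centroid to origin
    centered = centered / np.sqrt((centered ** 2).sum(axis=1).mean())  # scale: unit RMS radius
    # Orientation: rotate the main axis of the point cloud onto x.
    # The sign of each axis is left unresolved in this sketch.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt.T

def pca(X, n_components=2):
    """PCA via SVD; returns scores, components, mean, explained-variance ratios."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, s, vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s ** 2 / np.sum(s ** 2)
    comps = vt[:n_components]
    return Xc @ comps.T, comps, mean, ratio[:n_components]

rng = np.random.default_rng(0)
theta = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)

features = []
for _ in range(200):
    # Synthetic stand-in outline: a stretched, bumpy closed curve,
    # then randomly rotated, scaled, and shifted.
    a, b = rng.uniform(0.5, 2.0), rng.uniform(0.0, 0.3)
    r = 1.0 + b * np.cos(3 * theta)
    pts = np.column_stack([a * r * np.cos(theta), r * np.sin(theta)])
    angle = rng.uniform(0.0, 2.0 * np.pi)
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle), np.cos(angle)]])
    pts = rng.uniform(0.5, 5.0) * pts @ rot.T + rng.uniform(-10.0, 10.0, size=2)

    norm = normalize_contour(pts)
    # Assumed encoding: one centroid distance per contour point (256 values).
    features.append(np.linalg.norm(norm, axis=1))

X = np.array(features)
scores, components, mean, ratio = pca(X)
print(scores.shape, ratio.sum())
```

Here `scores` holds each shell's (PC1, PC2) coordinates, and `ratio.sum()` is the share of variance the two components keep, the quantity the article reports as 67% for the real dataset.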

The nearest neighbor to the Alghat fossil turned out to be Sphincterochila candidissima, a species whose oldest fossils are only about 38 million years old, far too recent to be a direct match for a find in Late Jurassic rock. The author treats the resemblance as a likely case of convergent evolution rather than shared lineage, and notes the obvious caveat that shape alone is a weak signal for taxonomy. The accompanying interactive tool lets readers locate other shells in the same latent space.
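A nearest-neighbor lookup under squared Euclidean distance can be sketched as below. The data is random stand-in vectors, and the helper `nearest_shells` and the `species_N` labels are hypothetical. Whether the author searched the full 256-dimensional space or the two-component projection is not stated in this summary; the sketch uses the full vectors.

```python
import numpy as np

def nearest_shells(query, X, labels, k=3):
    """Return the k (label, squared distance) pairs closest to `query`."""
    d2 = ((X - query) ** 2).sum(axis=1)  # squared Euclidean distance to every shell
    order = np.argsort(d2)[:k]
    return [(labels[i], float(d2[i])) for i in order]

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 256))                # stand-in shape vectors
labels = [f"species_{i}" for i in range(50)]  # hypothetical labels
query = X[7] + 0.01 * rng.normal(size=256)    # a slightly perturbed copy of shell 7

matches = nearest_shells(query, X, labels)
print(matches[0][0])
```

A small distance only says that two outlines look alike. As the article notes, it does not establish that the two shells are related.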


This is an AI-generated summary. Read the original for the full story.