LmCast :: Stay tuned in

I found a seashell in the middle of the desert

Recorded: May 31, 2026, 8:04 a.m.

Original Summarized

GitHub - Hawzen/I-found-a-seashell-in-the-middle-of-the-desert · GitHub

To my amazement, I found a solid rock that eerily resembles a seashell at the base of a cliff in the Alghat desert, Saudi Arabia. I didn't know what to make of it at first: it had the swirls and shape of a seashell but was solid rock. More importantly, it shouldn't be here; the nearest coastline is Dammam's, 500 km away.

This looks impossible

Carbonate rocks (e.g. limestone), marine fossils, coral fossils, and sedimentary structures (like ripples or bioturbation) all exist in and around Alghat, which indicates that parts of the Arabian Peninsula were once submerged under the sea, specifically in the Late Jurassic (~150 million years ago)[1].

Stratigraphic distribution figure of areas near Najd[1]

Nevertheless, I was still super curious about the fossil I found: what animal inhabited it? What did it look like back in the Jurassic? Does it have any modern relatives or lookalikes?
The proper way of answering these questions is a detailed analysis of the fossil (e.g. inspecting the sediment it was found in, its shape, etc.), done by an expert paleontologist. However, I know no paleontology, nor any paleontologist, so I figured I could do it myself (how hard could it be..?), though strictly via its shape, or what's called its morphology. Morphology alone is probably not accurate enough to discern lineage, since species from different lineages can look alike, so this is probably not the best way to do it. But it sounded fun and intuitive, so I gave it a try.
Concretely, I plan on:

Mathematically representing the shape of a shell
Defining a distance metric between shapes (so that I can find shells similar to the fossil's)
Mapping out the space of shapes

The Zhang et al. shell dataset[2] contains 59244 images of shells from 7894 different species; good enough for me!
Capturing 'shape' is actually a very hard problem: any object can be rotated (pitch, yaw, roll), scaled, and translated. Before starting any statistical analysis, I followed these guidelines to isolate the shape from the other factors:

The shell must be centered to the midpoint of the picture
The scale of the shell must be equivalent across all images (specifically, the maximum distance from the origin is 1)
Orientation is the hardest part

Pitch and yaw can be fixed by only choosing samples where the shell's opening is facing the camera. This is not perfect, but I found the dataset to be pretty consistent with its angles
Roll is difficult. A shell can be rotated in any way around the axis (even whilst the opening is facing the camera). My fix was to use the longest radius as the reference point, and rotate the shell so that the longest radius is always on the right. This is not perfect either, but it was good enough for me.
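The centering, scaling, and roll steps above can be sketched roughly as follows. This is a minimal sketch, not the repository's code: `normalize_contour` is a hypothetical helper, and using the centroid of the contour points as the "center" is an assumption (the write-up centers the shell in the picture).

```python
import numpy as np

def normalize_contour(points: np.ndarray) -> np.ndarray:
    """Center, scale, and roll-align an (N, 2) array of contour points."""
    pts = np.asarray(points, dtype=float)
    # Translation: move the centroid of the contour points to the origin
    # (assumption: the centroid stands in for the shell's center).
    pts = pts - pts.mean(axis=0)
    # Scale: the farthest point from the origin ends up at distance 1.
    radii = np.linalg.norm(pts, axis=1)
    pts = pts / radii.max()
    # Roll: rotate so the longest radius points along the positive x-axis.
    far = pts[np.argmax(radii)]
    angle = np.arctan2(far[1], far[0])
    c, s = np.cos(-angle), np.sin(-angle)
    rotation = np.array([[c, -s], [s, c]])
    return pts @ rotation.T
```

Pitch and yaw are not handled here; as described above, those are dealt with by only picking images where the opening faces the camera.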

Then, I extracted the contour of the shell to 256 points relative to the center. This way, each shell is represented by a 256x2 matrix, where each row is the (x, y) coordinates of a point on the contour. Example:
> contours[0].shape
(256, 2)
> contours[0].tolist()[:5]
[[-0.38561132550239563, 0.9804982542991638],
 [-0.4204626679420471, 0.9785506725311279],
 [-0.4553140103816986, 0.976603090763092],
 [-0.4901654124259949, 0.9746555089950562],
 [-0.5230183005332947, 0.9685550928115845]]
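The write-up doesn't say how the 256 points are picked along the outline. One common option, shown here purely as an assumption, is to resample the closed contour at equal arc-length steps; `resample_contour` is a hypothetical name.

```python
import numpy as np

def resample_contour(points: np.ndarray, n: int = 256) -> np.ndarray:
    """Resample a closed (M, 2) contour to n points evenly spaced by arc length."""
    pts = np.asarray(points, dtype=float)
    # Repeat the first point at the end so the last segment closes the loop.
    closed = np.vstack([pts, pts[:1]])
    seg = np.linalg.norm(np.diff(closed, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    # n target positions along the perimeter, excluding the duplicate endpoint.
    targets = np.linspace(0.0, cum[-1], n, endpoint=False)
    x = np.interp(targets, cum, closed[:, 0])
    y = np.interp(targets, cum, closed[:, 1])
    return np.column_stack([x, y])
```

Equal spacing matters because the point-by-point comparison used later only makes sense if point i sits at roughly the same place on every shell.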

Normalization pipeline

Naturally, the distance between two shells s1 and s2 is the squared Euclidean distance between their corresponding contour points:
$$
d(s_1, s_2) = \sum_{i=1}^{256} \left[ (s_1.x_i - s_2.x_i)^2 + (s_1.y_i - s_2.y_i)^2 \right]
$$
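In NumPy the distance is a one-liner, and finding the shells closest to a query is a sort over those distances. A minimal sketch, assuming `contours` is stacked as an (n_shells, 256, 2) array; `shell_distance` and `nearest_shells` are hypothetical names:

```python
import numpy as np

def shell_distance(s1, s2) -> float:
    """Squared Euclidean distance between two (N, 2) contours, point by point."""
    s1, s2 = np.asarray(s1, dtype=float), np.asarray(s2, dtype=float)
    return float(np.sum((s1 - s2) ** 2))

def nearest_shells(query, contours, k: int = 5):
    """Indices and distances of the k contours closest to the query contour."""
    contours = np.asarray(contours, dtype=float)
    diffs = contours - np.asarray(query, dtype=float)  # broadcasts over shells
    dists = np.sum(diffs ** 2, axis=(1, 2))
    order = np.argsort(dists)[:k]
    return order, dists[order]
```

The metric compares point i of one shell with point i of the other, so it is only meaningful after both contours went through the same normalization.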
Representing the space will require 256 dimensions, which is a little more than the 2 I need to plot it over x and y. Given the normalized shell contour above, it's clear that many of these dimensions are redundant (for instance, the space of all possible 256-point contours allows self-intersection, while the space of possible shells doesn't, AFAIK), so the space of possible shells can be condensed into a smaller latent space. To drive my point home, I'll show three examples of fully random contours (i.e. pseudo-random points around the origin).

Probably not a real shell
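Random contours of this kind are easy to generate. A minimal sketch, assuming uniform sampling in a square around the origin (the exact distribution behind the figure isn't stated); `random_contour` is a hypothetical helper:

```python
import numpy as np

def random_contour(n: int = 256, seed=None) -> np.ndarray:
    """n pseudo-random points around the origin, scaled so the farthest is at 1."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(-1.0, 1.0, size=(n, 2))
    return pts / np.linalg.norm(pts, axis=1).max()

# Three random "shells"; joining consecutive points gives a tangled scribble.
examples = [random_contour(seed=s) for s in range(3)]
```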

Dimensionality reduction techniques map the original 256 dimensions onto a smaller number of dimensions (e.g. 2 or 3) while trying to preserve the distances between shells as much as possible. One such technique, and the one I'll be using, is Principal Component Analysis (PCA). Here's an excellent answer that explains how PCA works: https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues/140579#140579.
After applying PCA, I retained 56.50% of the variance using only the first principal component (PC1), and 67.25% using the first two. This means we can describe a shell's shape by only two numbers, and be pretty close to the original shape!
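For reference, PCA and the explained-variance ratios can be computed with a plain SVD. This is a sketch under assumptions, not the repository's notebook: `fit_pca` is a hypothetical helper, and flattening each contour into one row of x and y values is my assumption (the write-up counts 256 dimensions, and how the data is arranged before PCA isn't shown).

```python
import numpy as np

def fit_pca(contours: np.ndarray, k: int = 2):
    """PCA via SVD on contours of shape (n_shells, n_points, 2)."""
    # Assumption: each shell becomes one flat row [x0, y0, x1, y1, ...].
    X = np.asarray(contours, dtype=float).reshape(len(contours), -1)
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = S ** 2 / np.sum(S ** 2)   # fraction of variance per component
    components = Vt[:k]
    scores = (X - mean) @ components.T    # latent coordinates of each shell
    return mean, components, scores, explained[:k]
```

With the real data, `explained[:2].sum()` would be the number to compare against the 67.25% quoted above.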
The interesting part is trying to understand what these two numbers mean. Dimension 1 of the original 256-dimensional space encodes the location of the first contour point of the shell, whereas dimension 1 of the latent space encodes a high-level feature learned by the PCA algorithm. We can try to understand visually what PC1 represents by finding two shells that are far apart along PC1 yet similar in all other dimensions.
Essentially, we want to find two shells i and j such that the following score is maximized, where z_{i,1} is shell i's PC1 coordinate and z_{i,2:k} holds its remaining k-1 latent coordinates:
$$
\text{score}(i,j) = \frac{|z_{i,1} - z_{j,1}|}{\lVert \mathbf{z}_{i,2:k} - \mathbf{z}_{j,2:k} \rVert_2}
$$
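A brute-force search for the best pair could look like the sketch below. `best_contrast_pair` is a hypothetical helper; the small `eps` in the denominator is my addition to avoid dividing by zero when two shells agree exactly on the other components.

```python
import numpy as np

def best_contrast_pair(scores: np.ndarray, axis: int = 0, eps: float = 1e-9):
    """Return (i, j, score): far apart on one component, close on the rest."""
    z = np.asarray(scores, dtype=float)
    target = z[:, axis]
    rest = np.delete(z, axis, axis=1)
    # Numerator: absolute gap along the chosen component, for every pair.
    num = np.abs(target[:, None] - target[None, :])
    # Denominator: Euclidean distance over all remaining components.
    den = np.linalg.norm(rest[:, None, :] - rest[None, :, :], axis=2)
    score = num / (den + eps)
    np.fill_diagonal(score, -np.inf)      # never pair a shell with itself
    i, j = np.unravel_index(np.argmax(score), score.shape)
    return int(i), int(j), float(score[i, j])
```

This builds an n-by-n table, so on tens of thousands of shells it would need a random subsample or a chunked loop.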
PC1 seems to capture the 'pointiness' of the shell, i.e. more than 50% of variance in shell shapes can be explained by how pointy they are. PC2 seems to capture the symmetry of the shell, or perhaps the mass distribution over the vertical axis. I'll leave the interpretation of the other dimensions as an exercise for the reader (I have no idea).

And now for the grand finale, we can plot the shells in the latent space, and see where our Alghat fossil fits in it. But first, for dramatic tension, I will discuss the plot.
The plot represents PC1 on the x-axis and PC2 on the y-axis, while color represents the roughness of a shell (computed as the difference in slope between consecutive points). The following observations are worth noting:

Negative PC1 values (representing roundness) are way more common than positive PC1 values (representing pointiness). Yet round shells are less diverse and occupy less of the space than pointy shells
Pointy shells seem to be much rougher than round shells
Shells with negative PC1 values always have PC2 values close to zero; no shell in the dataset has a round but asymmetric shape. Below, I project such shells back from the latent space to the shape space, imagining impossible shells
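The roughness measure used for the plot's color is described only as the difference in slope between consecutive points. One way to read that, offered as an assumption rather than the author's exact formula, is the mean absolute change in edge direction around the closed contour:

```python
import numpy as np

def roughness(contour: np.ndarray) -> float:
    """Mean absolute turning angle (radians) between consecutive contour edges."""
    pts = np.asarray(contour, dtype=float)
    edges = np.roll(pts, -1, axis=0) - pts              # closed contour
    heading = np.arctan2(edges[:, 1], edges[:, 0])      # direction of each edge
    turn = np.diff(np.append(heading, heading[0]))
    turn = (turn + np.pi) % (2 * np.pi) - np.pi         # wrap to [-pi, pi)
    return float(np.mean(np.abs(turn)))
```

Angles are used instead of a literal dy/dx slope because the slope blows up on vertical edges. A smooth outline turns a little at every point, while a jagged one keeps swinging back and forth and scores higher.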

Map of shell latent space with example shells

Modifying Principal Components against the mean shell

Projecting 'impossible' shells
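Projecting a latent point back to a contour is the PCA transform run in reverse: start from the mean shell and add each principal direction weighted by its coordinate. A minimal sketch, assuming `mean` and `components` come from a PCA fitted on flattened contours; `reconstruct` is a hypothetical helper:

```python
import numpy as np

def reconstruct(mean, components, z, n_points: int = 256) -> np.ndarray:
    """Map latent coordinates z back to an (n_points, 2) contour."""
    z = np.asarray(z, dtype=float)
    mean = np.asarray(mean, dtype=float)
    components = np.asarray(components, dtype=float)
    # Mean shell plus a weighted sum of the first len(z) principal directions.
    flat = mean + z @ components[: len(z)]
    return flat.reshape(n_points, 2)
```

Picking a latent point where the dataset is empty, such as a strongly negative PC1 with a large PC2, gives a shape no real shell has.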

So, which shell most closely resembles our Alghat fossil? It's Sphincterochila candidissima (try to pronounce it). However, the species is really young, nowhere near Jurassic in age: its earliest fossil dates back only 38 million years[4]. Ultimately, shape is not the best way of determining shell lineage, but its eerie similarity to the Alghat fossil is still fascinating, and perhaps points to some sort of convergent evolution, where two different species evolve similar shapes under similar environmental pressures.

Left: Alghat fossil, Right: Sphincterochila candidissima[3]

Explore the tool
Feel free to explore the tool and try to figure out where a shell of your choice fits in the shell latent space!
https://shell.hawzen.me

References

[1] Aba Alkhayl, S. S. (2022). Marine macro-invertebrate fossils from the Lower Hanifa Formation (Hawtah Member), central Saudi Arabia. Arabian Journal of Geosciences, 15, 1410. https://doi.org/10.1007/s12517-022-10581-w
[2] Zhang, Q., Zhou, J., He, J., et al. (2019). A shell dataset, for shell features extraction and recognition. Scientific Data, 6, 226. https://doi.org/10.1038/s41597-019-0230-3
[3] Sphincterochila candidissima. Wikipedia. https://en.wikipedia.org/wiki/Sphincterochila_candidissima
[4] Tracey, S., Todd, J. A., & Erwin, D. H. (1993). Mollusca: Gastropoda. In M. J. Benton (Ed.), The Fossil Record 2 (pp. 131–167). London: Chapman &


The repository details an exploratory process undertaken by the author to hypothesize the identity of a rock found in the Alghat desert that exhibits morphological features resembling a seashell, based on a self-directed analysis of shape rather than formal paleontology. The discovery is placed in geological context: carbonate rocks and marine fossils in the region indicate that parts of the Arabian Peninsula were submerged during the Late Jurassic, approximately 150 million years ago, a claim the author supports with a cited study of marine fossils from central Saudi Arabia.

Since direct paleontological analysis was not feasible, the author chose to use computational geometry and dimensionality reduction techniques to analyze the shell's morphology. This involved capturing the shape of the fossil by extracting 256-point contours from images of shells sourced from a dataset by Zhang et al. The process required careful normalization to manage rotational variability; specific guidelines were established to fix pitch and yaw by selecting images where the shell opening faced the camera, and to manage roll by aligning the longest radius along a defined axis.

The spatial relationship between shells was quantified using the squared Euclidean distance between their contour points. This resulted in a high-dimensional space of 256 dimensions, which was deemed too complex for direct visualization. To condense this information, the author employed Principal Component Analysis (PCA) to reduce the data into a lower-dimensional latent space while attempting to preserve inter-shell distances. The analysis revealed that the first principal component (PC1) explained 56.50% of the variance, and the first two components (PC1 and PC2) collectively explained 67.25% of the variance, suggesting that shell shape can be effectively described by two parameters.

The interpretation of these latent dimensions revealed meaningful physical properties. PC1 was interpreted as capturing the 'pointiness' of the shell, while PC2 seemed to represent the shell's symmetry or mass distribution relative to the vertical axis. Furthermore, the visualization of shells in this latent space, color-coded by calculated roughness (the difference in slope between consecutive points), provided further insights. Observations indicated that rounder shapes (negative PC1 values) were more common than highly pointy shapes (positive PC1 values), and pointy shells tended to exhibit greater roughness.

Ultimately, the analysis led to a visual comparison between the shape of the Alghat fossil and known species, suggesting a resemblance to Sphincterochila candidissima; however, the author notes that this morphological similarity is not sufficient to determine lineage, hinting at processes of convergent evolution where different species may evolve similar shapes under similar environmental pressures. The exploration establishes that while shape is a fascinating descriptor, it is not the definitive method for discerning evolutionary relationships.