I don't think I have the exabyte-scale capacity :)
AFAIK this is not mandatory; a cluster with a capacity of a dozen petabytes
should be fine for initial testing and learning :-)
Are there any recommendations / patterns on how Ceph should be used to
make better use of its features?
You can find some general performance tuning tips, such as [1], but I'm not
aware of any recommended usage patterns. However, I'm a Ceph beginner, so maybe
it's just my ignorance.
As for ceph-ispn specifically, I'd like to learn more about the CRUSH algorithm
and CRUSH map options [2], to see whether it is possible to map an ISPN segment
to a specified Ceph primary OSD. That would allow us to run an ISPN node and its
corresponding primary OSD on the same machine (similar to what we do in the
ISPN-Spark integration), which should result in better performance.
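To make the co-location idea a bit more concrete, here is a rough Python sketch of how one could check which segments already have their primary OSD on the same host as the owning ISPN node. It assumes the primary OSD of an object is looked up with `ceph osd map <pool> <object> -f json` and that the JSON contains an `acting_primary` field; I have not verified this against a real cluster, so treat that as an assumption. The pool name, object naming scheme ("segment-N"), host names and all sample data are hypothetical.

```python
import json


def primary_osd(osd_map_json):
    """Return the acting primary OSD id.

    Assumption: the input is the output of
    `ceph osd map <pool> <object> -f json` and has an "acting_primary" key.
    """
    return json.loads(osd_map_json)["acting_primary"]


def colocated_segments(segment_owner, segment_primary, osd_host):
    """Return the segment ids whose ISPN owner host is also the host
    of the primary OSD for that segment's object."""
    return sorted(
        seg
        for seg, host in segment_owner.items()
        if osd_host.get(segment_primary.get(seg)) == host
    )


# Hypothetical sample data, not taken from a real cluster.
SAMPLE = (
    '{"epoch": 13, "pool": "ispn", "objname": "segment-0", '
    '"up": [1, 0], "up_primary": 1, "acting": [1, 0], "acting_primary": 1}'
)
segment_owner = {0: "hostA", 1: "hostB"}   # segment id -> ISPN node host
osd_host = {0: "hostA", 1: "hostB"}        # OSD id -> host running that OSD
segment_primary = {0: primary_osd(SAMPLE), 1: 1}

print(colocated_segments(segment_owner, segment_primary, osd_host))
```

This only measures how well the placement already lines up. Actually steering the primary OSD for a given segment would still have to be done on the Ceph side via the CRUSH map [2], if that turns out to be possible at all.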
[1] http://tracker.ceph.com/projects/ceph/wiki/7_Best_Practices_to_Maximize_Y...
[2] http://docs.ceph.com/docs/jewel/rados/operations/crush-map/