SCAN email Announcements

SCAN Update March 30, 2020

Data served by SCAN is dynamic: most data records and images need to be vetted, possibly corrected, or simply appended with additional information. Anyone can participate in crowdsourcing to increase the quality and quantity of data served by any data provider, as long as they have an internet connection and obtain permission from the data provider. The two most obvious crowdsourcing activities are transcription of label images and georeferencing.

Transcription of label images: Some collections image the labels and upload the images to SCAN. There are two primary means for transcribing and annotating label data virtually. The most common method is to submit a project to Notes from Nature, where volunteers will transcribe the label images. You can also crowdsource your work within Symbiota if you have a Live collection. In the case of Notes from Nature, the transcribed records are returned to the data provider, who then inserts the records into their database. For Symbiota crowdsourcing, all of the transcribing is performed in SCAN. Below is an example of a visual guide Sarah Bush is using to help her students transcribe slides.

The only collections that are mass imaging and crowdsourcing label images are the University of Florida (LepNet, Laurel Kaminsky, lkaminsky@floridamuseum.ufl.edu); the University of Utah (ParasiteTracker, Sarah Bush, dovelouse@gmail.com); and the University of Arizona (SCAN, Gene Hall, wehall@EMAIL.ARIZONA.EDU). If you want to help process records, please contact the respective data managers and they can provide you with institution-specific guides. I will also post documents on the SCAN WordPress site. Laurel Kaminsky and Sarah Bush have provided documents that describe in general how to participate in crowdsourcing their collection data in SCAN.

Georeferencing: There are over 3 million records on SCAN that need to be georeferenced and probably over 1 million records that need to be corrected or have more precise coordinates applied. There are three ways of helping: download records and use GeoLocate or some other tool to apply coordinates; work within SCAN on specific collections and perform batch georeferencing; or vet records after batch georeferencing through Yale (Nelson Rios and Larry Gall). All three options require that you work with individual data providers to establish a protocol. If your own collection needs georeferencing, why not start there? Let me know if you need help getting started.
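For anyone taking the first option (download records, then apply coordinates with an outside tool), here is a minimal sketch of how the records still lacking coordinates might be pulled out of a downloaded CSV file. It assumes the download uses the Darwin Core column names decimalLatitude, decimalLongitude and locality; the function names and file paths are hypothetical, and the protocol you agree on with the data provider takes precedence over anything shown here.

```python
import csv
from typing import Iterable


def needs_georeferencing(record: dict) -> bool:
    """True when a record has locality text but is missing a coordinate."""
    # Column names are assumed to follow Darwin Core; adjust to your download.
    lat = (record.get("decimalLatitude") or "").strip()
    lon = (record.get("decimalLongitude") or "").strip()
    locality = (record.get("locality") or "").strip()
    return bool(locality) and not (lat and lon)


def select_for_georeferencing(rows: Iterable[dict]) -> list[dict]:
    """Keep only the rows that still need coordinates."""
    return [row for row in rows if needs_georeferencing(row)]


def split_download(in_path: str, out_path: str) -> int:
    """Copy rows needing coordinates to out_path and return how many there were."""
    with open(in_path, newline="", encoding="utf-8") as src:
        reader = csv.DictReader(src)
        fieldnames = reader.fieldnames or []
        pending = select_for_georeferencing(reader)
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        # extrasaction="ignore" skips stray cells from malformed rows.
        writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        writer.writerows(pending)
    return len(pending)
```

Records with no locality text are left out on purpose, since there is nothing for a georeferencing tool to work from; those need a look at the label first.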

 
SCAN Update  March 23, 2020

SCAN Moved to ASU: SCAN is up and running at ASU, and the URL everyone uses to access SCAN is still the same, https://scan-bugs.org . Please avoid downloading large data sets or performing extensive queries this week. There will be a scheduled maintenance shutdown starting at 6 pm on Wednesday, March 25th, as ASU research computing performs networking changes. SCAN is likely to be down for only a few hours, but the outage may extend to the 26th. As always, let me know if something is not working. For reference, I keep a list of SCAN issues at https://github.com/scan-bugs-org/scan/issues

Bee Holdings Survey: I have been methodically sending out the survey of bee holdings to individual collections, in part to keep track of responses. However, I assume most collections are closed for the near future, so I am sharing the current version with everyone (attached). Please find your collection in the “Drawers” tab and fill it out as best you can. I included 19 collection responses if you want to know how other collections responded. The survey is based on drawers; if you have specimen-level estimates, just clarify in the notes row. Please return the survey to me. I will post a community version at the end of April, and you can then revise any of your estimates or add estimates if you did not get a chance to respond before.

Building an Arthropod Collections of the World Map/Database: Evin Dunn has started the website extension of the North American collections database and map to include all arthropod collections in the world, https://bug-collections.org/. In the next couple of months, it will become fully functional and will be like SCAN in that any collection manager can edit information about their collection. It will be set up so that GBIF simply harvests collection information, as they do with SCAN when a collection is registered on GBIF. I expect it to also be linked to GRSciColl and the planned “Collections Catalogue”. We will submit an abstract to the upcoming Digital Data Conference.

 

SCAN Update  February 27, 2020

SCAN Moving: We are setting up an instance of SCAN on an ASU server, and if everything works out this will become the long-term home for SCAN. Chris Jordan at TACC has been unbelievably supportive and is still willing to host SCAN on a UTexas TACC server. After we go through a few weeks of testing I will let everyone know about the move. I expect it to be flawless, and we will maintain the same URL (scan-bugs.org). In this process, we will test adding non-arthropod taxonomy tables to help people enter biotic association data. We plan to replace the indexing software SOLR with Elasticsearch by the summer, but we may try to just run SCAN without SOLR. We think SOLR is the primary cause of most performance problems we have experienced in the last several months.

ADBC Funding: We are planning one more comprehensive digitization project, the “North American Bees”. Katja Seltmann will be taking the lead on the project. We still have to find out if NSF-ADBC or some other program would fund the project, but we will make a strong case. NSF announced last fall that they have $10 million in funding that can be used to support projects starting in 2021, but they were not sure how the funding would be targeted.

Focusing on a North American Bee Network will be unique in that it would generally complete digitization for an ecologically important group, and the data would be important to conservation and education-outreach as well as to research in evolutionary ecology and data science. There are already 2,767,952 North American bee records that include data for most of the 4,000+ bee species in North America. Only butterflies have more records per species than bees, but we need to triple the number of bee records to have enough data to comprehensively address questions above the species level. We have identified over 70 collections that could collaborate on a digitization effort. Additionally, SCAN partnered with a recently funded USDA project to develop a national bee research coordination network (Hollis Woodard, UC-Riverside; Bryan Danforth, Cornell; and others). If you are interested in participating, please let me know.

GBIF goal: Most of the active collections in North America have registered with GBIF; we are up to 77 collections in the US. Let me know if you want to register; it is the easiest and best single thing you can do for your collection to be recognized. I hope to add 10-15 more collections by the end of 2020.

SCAN Update  October 13, 2019
  1. I have added people working on Parasite Tracker Thematic Collections Network to this list if they were not already on the list.
  2. The batch image uploader https://scan-bugs.org/portal/imagelib/imagebatch.php has been working for images that link to existing records. However, it is still undergoing an update to allow images to create skeletal records and also to minimize the number of steps an end user needs to perform. It should involve simply selecting your collection and uploading a zip file containing images. It will be ready by the end of this week. We have removed links to the iDigBio uploader so that it will not cause confusion in the future.
  3. Chuck Sexton is conducting a review of the distribution and identification of moths in the genus Petrophila (Lepidoptera: Crambidae) in North America. There is a need for additional digital images of specimens and/or live individuals not heretofore uploaded (as of Sept. 15, 2019) to the following online sites: BugGuide, iNaturalist, Flickr, Moth Photographer’s Group. Particular gaps in coverage include records of Petrophila in the following areas:

— Canadian Prairie Provinces

— Southeastern U.S. States from Louisiana and Arkansas east to the Carolinas

If you have records and can image specimens, please let Chuck Sexton know (gcwarbler@austin.rr.com). Then upload the images to SCAN and he can use them. He also works on Cisthene and is happy to review anyone’s records/images for that genus.

  4. The LightningBug proposal to NSF will be submitted today. It will increase rates and quality of transcription and specimen imaging. We will have a dedicated project website up and running in November. If you are planning to submit a CSBR grant in 2020, you should seriously consider incorporating LightningBug workflows if digitization is part of the project. Four arthropod ADBC PEN grants were submitted last week: 2 LepNet, 1 SCAN and 1 Parasite Tracker. Fingers crossed!
  5. We are now actively exchanging information with BOLD, and I will set up a separate BOLD group. The purpose is to increase the capacity for users to share data with BOLD, and I hope it also encourages more collections to engage in genetic digitization. Please let me know if you submit records to BOLD and want to be a part of the conversation. Wendy Moore is the SCAN user who is championing this project.
  6. We are still experiencing some lingering speed problems related to SOLR. If you are experiencing problems, email me; we have been able to solve problems pretty quickly. The one annoyance that might be with us till January is the batch record uploading procedure, where the final step in transferring records appears to take a long time. The records are actually transferred sooner than it appears, but the page still indicates that it is transferring records.
  7. The first of two reviews on North American arthropods and arthropod collections should be out in PeerJ before the end of the year. In the process of finishing the second part, we will complete a data quality assessment of all records. If you want to participate in some global updates (e.g. inserting country, datum etc.), let me know. If you want to know how compliant your data is, and SCAN serves your collection data to GBIF and iDigBio, there are two links on your collection page that take you to GBIF and iDigBio, where you can observe your data quality red flags. Over the next six months GBIF will look for ways to make it easier to identify and correct mistakes for all data providers. In developing solutions, the first option is always proposing something that works for everyone, not just Live collections in SCAN.
  8. I started to retain SCAN updates on the SCAN email Updates page for easy reference.

SCAN Update  September 24, 2019

I am converting the LepNet WordPress site https://lep-net.org  back to a SCAN data portal support site https://scan-all-bugs.org/.

The website has individual tabs for information specific to each of the three Thematic Collections Networks (TCNs): LepNet, the original SCAN TCN, and the new Parasite Tracker TCN. Information on the site will also be relevant to InvertEBase arthropod collections.

In addition to the new Parasite Tracker TCN, we finally had a LepNet PEN grant funded; it was awarded to the San Diego Natural History Museum. The description of the project by the San Diego Natural History Museum is found here . It is a great addition to LepNet and will be focusing on Baja California leps.

The Parasite Tracker TCN will strongly emphasize biotic associations between arthropod parasites and their vertebrate hosts, so Jorrit Poelen has been developing links with various data providers to capture association data and share it with GloBI. Katja Seltmann will be leading a special session at the ADBC summit next week to further discuss how we can effectively capture association data and share it. The broader impacts of this work will be appreciated by anyone working with arthropods that have important biotic associations (e.g., pollinators).

We finally put up the first draft of the Arthropod Collections Map, and depending on funding we will develop this into an arthropod collections index project comparable to Index Herbariorum that provides basic information about each collection and eventually has a global extent. Until then, feel free to check out the clickable map and see if we correctly placed your collection on the map and linked to your website.

I am meeting with GBIF tomorrow to discuss performing a batch registration of institutions/collections that have not yet registered. So if you are not registered with GBIF, I will be contacting you about participating in this process.

 

SCAN Update  June 26, 2019

It is gratifying to receive recent news that at least three pots of funding will keep SCAN going for at least four more years with no other support. However, in the short term that does not help people who have been experiencing difficulties with SCAN in the past few weeks. Although we tested SCAN on TACC for six months prior to the move, we continue to have problems we did not expect. Most of these are related to SOLR, especially where it takes SOLR too long to index and slows the whole system down. SOLR is no longer being supported, and so its “bugginess” gets worse with time. It will be replaced by Elasticsearch, and I hope that happens this fall. There have been other quirks, and Evin will address those in due course. We have the added challenge that Evin has two other projects that he is committed to and he is getting married in a month. In particular, we will only have emergency support from July 28 to August 8 while he is on his honeymoon. He hopes to have all the kinks worked out before he leaves.

Thanks for being patient; I apologize for any inconvenience people have endured. – Neil
