Add Data Explanation #15
Comments
Are you sure 01-04 actually prefer active neurons, or is this just relative to 00? In my experience, most people would manually choose their cells on the mean image and make sure they have some activity. However, due to neuropil contamination, everything selected on the mean image has activity anyway.
The methods for 01-04 vary, as far as I understand it, but all include time-series information in one form or another. The Harvey lab datasets, for example, use no information from the mean image in detecting cells, so it is impossible for them to select completely inactive cells. We use an optimized spectral clustering approach on very small spatial windows, and manual annotators adjust clustering parameters in real time while viewing the resulting traces and neuropil subtraction with an individually fit linear model (a generic sketch of one such model follows below). Thus the manual annotator never 'draws' ROIs on any image, and we do not falsely attribute a neuropil signal present at inactive cells to cellular activity. However, the manual annotator knows where cells are likely to be and what they should look like, so this will guide them in cell selection compared to an unsupervised approach.

We think we are essentially looking at the same information as the factorization-based approaches, but doing an inordinate amount of manual adjustment and fine-tuning in real time for every cell, which would be fantastic to automate away! I think some datasets are more what you're talking about, though, involving manual 'circling' of cells on some kind of image that may include pixel-to-pixel correlations. I think a good explanation of the 'truth' for each dataset would be very useful as we see how algorithms match up to these various truth definitions. Glad to provide more specific info if it'd be useful for the project.
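For readers unfamiliar with the neuropil-subtraction step mentioned above, here is a minimal sketch of one common linear-model formulation: fit the ROI trace as an affine function of the surrounding neuropil trace by least squares, then subtract the fitted contamination. This is generic illustration only; the actual model fit in the Harvey lab pipeline may differ, and the function name and the clipping range on the coefficient are assumptions.

```python
import numpy as np

def subtract_neuropil(f_cell, f_neuropil):
    """Fit f_cell ~= a + r * f_neuropil by ordinary least squares,
    then subtract the fitted neuropil component.

    f_cell, f_neuropil: 1-D fluorescence time series for one ROI
    and its surrounding neuropil region.
    """
    # Design matrix with an intercept column and the neuropil trace.
    A = np.column_stack([np.ones_like(f_neuropil), f_neuropil])
    (a, r), *_ = np.linalg.lstsq(A, f_cell, rcond=None)
    # Clip r to a plausible range so a noisy fit cannot over-subtract
    # (the [0, 1] bound is an assumption, not a published value).
    r = np.clip(r, 0.0, 1.0)
    return f_cell - r * f_neuropil
```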
Add clear test-data preferences: 00 includes inactive neurons, while 01-04 prefer active neurons.
Add definitions of recall/precision/inclusion/exclusion, as well as predictions for the above differences in the data (a sketch of how these metrics might be computed follows below): for algorithms that prefer active neurons, the best results are expected on 01-04, with low recall but high precision on 00; for algorithms that prefer inactive neurons, the best results are expected on 00, with high recall but low precision on 01-04.
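For concreteness, here is a minimal sketch of how precision and recall could be computed by matching detected ROI centers to ground-truth centers within a distance threshold. The function names, the greedy matching strategy, and the 5-pixel threshold are illustrative assumptions, not the challenge's official scoring code.

```python
import numpy as np

def count_matches(true_centers, found_centers, max_dist=5.0):
    """Greedily match found ROI centers to ground-truth centers,
    closest pairs first, within max_dist pixels (assumed threshold).
    Both inputs are (N, 2) arrays of (row, col) centers.
    Returns the number of true positives."""
    true_centers = np.asarray(true_centers, dtype=float)
    found_centers = np.asarray(found_centers, dtype=float)
    # Pairwise distances between every true and found center.
    dists = np.linalg.norm(
        true_centers[:, None, :] - found_centers[None, :, :], axis=-1)
    matched_true, matched_found = set(), set()
    pairs = ((i, j) for i in range(len(true_centers))
                    for j in range(len(found_centers)))
    for i, j in sorted(pairs, key=lambda ij: dists[ij]):
        if dists[i, j] > max_dist:
            break  # remaining pairs are all farther apart
        if i not in matched_true and j not in matched_found:
            matched_true.add(i)
            matched_found.add(j)
    return len(matched_true)

def precision_recall(true_centers, found_centers, max_dist=5.0):
    tp = count_matches(true_centers, found_centers, max_dist)
    precision = tp / max(len(found_centers), 1)  # TP / (TP + FP)
    recall = tp / max(len(true_centers), 1)      # TP / (TP + FN)
    return precision, recall
```

Under this framing, an algorithm that skips inactive neurons on 00 loses recall (many ground-truth cells go unmatched) while keeping precision high (its detections are mostly real cells), which is exactly the prediction above.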
This will hopefully encourage labs to submit their algorithms even if they are not the most successful, because no algorithm is ideal across all of the provided datasets. Additionally, enable labs to post an explanation of their results so they can make themselves look good (and make sense of their results).