#introductions Hi, I'm an astronomer and data scientist at the University of Washington. I work on statistical methods for astronomy, including black holes, asteroids and the sun. I also play harp and take pictures of things (mostly waterfalls and squirrels).
@tiana_athriel in all seriousness, what does that statistical model look like? Are you using a matched filter and then correlation/convolution to look for repeated signals?
@fullywoolly Right now I work a lot with Gaussian Processes. They're magic! :) I have a lot of non-stationary, unevenly sampled time series that may or may not have a periodic signal. Since I don't have *that* many data points, Gaussian Processes are perfect for that task!
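For anyone following along, here is a minimal sketch of Gaussian Process regression on an unevenly sampled time series, using only numpy. It is not the model described above: the periodic kernel, the fixed hyperparameters, the simulated data and the names `periodic_kernel` and `gp_predict` are all assumptions made for illustration. The kernel here is also stationary, which is a simplification; in real work the hyperparameters would be fitted rather than set by hand.

```python
import numpy as np


def periodic_kernel(t1, t2, amplitude=1.0, length_scale=1.0, period=1.0):
    """Exp-sine-squared (periodic) covariance between two sets of times."""
    dt = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(
        -2.0 * np.sin(np.pi * dt / period) ** 2 / length_scale**2
    )


def gp_predict(t_train, y_train, t_test, noise_std, **kernel_params):
    """Posterior mean and standard deviation of a zero-mean GP at t_test."""
    # Covariance of the observations, with white noise on the diagonal.
    K = periodic_kernel(t_train, t_train, **kernel_params)
    K += noise_std**2 * np.eye(len(t_train))
    K_s = periodic_kernel(t_test, t_train, **kernel_params)
    K_ss = periodic_kernel(t_test, t_test, **kernel_params)

    # A Cholesky factorisation is more stable than inverting K directly.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s @ alpha

    v = np.linalg.solve(L, K_s.T)
    cov = K_ss - v.T @ v
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std


# Hypothetical data: 30 irregularly spaced samples of a noisy sinusoid.
rng = np.random.default_rng(42)
t_obs = np.sort(rng.uniform(0.0, 10.0, size=30))
y_obs = np.sin(2.0 * np.pi * t_obs / 2.5) + 0.1 * rng.normal(size=t_obs.size)

t_grid = np.linspace(0.0, 10.0, 200)
mean, std = gp_predict(
    t_obs, y_obs, t_grid,
    noise_std=0.1, amplitude=1.0, length_scale=1.0, period=2.5,
)
```

Nothing in this requires evenly spaced samples, because the kernel only looks at time differences between pairs of points. The cost is the Cholesky step, which scales as the cube of the number of data points, so this approach suits small data sets.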
@tiana_athriel they do sound magic! Everything I've had to process in an academic setting has had either a well-defined system or a well-defined signal. It is hard to imagine that you can get much of anything without having either of those things! Space is a pretty good medium for transmitting light, but over such vast distances the constructive and destructive interference must be brutal.
Important question: Python or Matlab/Octave?
@fullywoolly Not really that much interference, mostly just gas and dust in the way that messes things up. Astronomy data is pretty clean compared to other fields, and the problems are often well-bounded (compared to, say, ecology). I do everything in Python (yay open source!), but also know people who write R, C++ and, of course, the physicists' favourite: Fortran. Matlab/Octave is much rarer.
@fullywoolly A lot of statisticians write R, so if you're interested in the newest and shiniest of statistical models, sometimes that's where you have to go.
And sometimes you just have algorithms that are too slow to compute in pure Python, even with numpy and scipy.
I'm currently learning Vega/Vega-Lite and the Python package Altair for #visualization, which I find much more sensible and fun than matplotlib (though I still do a *lot* of visualizing in that, too).
@tiana_athriel makes sense about the stats people using R.
numpy is already compiled/optimized, right? I can understand it if you are I/O bound, like thrashing the hard disk by reading/writing too frequently, or saturating the memory bus. That is where languages like C or C++ will be stronger.
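A small sketch of the distinction being discussed here, assuming a toy sum-of-squares computation (the function names and the data are made up for illustration). The first version loops in the Python interpreter, the second hands the whole array to numpy's compiled routines in one call.

```python
import numpy as np


def sum_of_squares_loop(values):
    """Pure-Python loop: every multiply and add goes through the interpreter."""
    total = 0.0
    for v in values:
        total += v * v
    return total


def sum_of_squares_numpy(values):
    """Vectorised version: a single call into numpy's compiled dot product."""
    return float(np.dot(values, values))


# Hypothetical data, just to compare the two implementations.
data = np.linspace(0.0, 1.0, 100_000)
loop_result = sum_of_squares_loop(data)
numpy_result = sum_of_squares_numpy(data)
```

Both give the same answer up to floating-point rounding, but only the second benefits from numpy being compiled. An algorithm where each step depends on the result of the previous one often cannot be written as whole-array operations like this, and that is one common reason code stays slow in Python even with numpy available.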
Vega looks nice. Would've helped with my last project at work. I haven't tried tools like that or pandas, even though my dataset was coming from a db. I just didn't think about it.