(Let's assume that there's no weirdness where grains make your stomach feel worse for the next 2 weeks or something. In fact, let's forget about the fact that these are time series data at all. This might be a minor or a major statistical transgression, I don't know.)
I think the test to use here is Student's t-test (AKA just "t-test"). Specifically, an independent two-sample t-test with equal variances (assumed) and unequal sample sizes. (Side note: I also remembered z-tests and ANOVAs, and it looks like:
- a z-test is a simpler test that you can use when you're just trying to know if a sample that you've taken is significantly different from the group as a whole. For example, if I knew my "stomach" value for every day of my life (or even if I just knew the mean and standard deviation), I could use a z-test to tell if those 19 no-grains days were unusually high/low stomach-value days compared to all 9038 days of my life.
- an ANOVA is a generalization of the t-test. Specifically, the one-way ANOVA is something you can use to compare >2 means. For example, if I tried "no grains" for two weeks, then "no meat" for two weeks, then "no coffee or peanuts" for two weeks (clearly this would be the hardest), I could compare my stomach value for all of those.
- if "ANOVA" wasn't scary enough, there are also things called "ANCOVA" and "MANOVA"; the latter makes me snicker every time.)
Okay! Let's do some t-testing! In one corner, the stomach (, mood, energy) values of my grains days. In the other corner, the stomach (, mood, energy) values of my no-grains days. Which Is Bigger?
t-test on mood before and after grains:
t = -0.78694967152, p = 0.439345514713
t-test on energy before and after grains:
t = -0.961369818324, p = 0.346365085994
t-test on stomach before and after grains:
t = -1.98753017649, p = 0.0588975495116
Again, the p-value is the one that tells you if there's anything going on. A small p-value means that, if grains made no real difference, there'd be only a small chance of seeing an effect this big. Looks like I can't say anything about whether grains affect my mood or energy. But p=0.059 is pretty small! (Traditionally 0.05 is the threshold for caring about p-values, at least in psych.) And surprisingly so. I couldn't have told you that from the graph.
Let's look at the data again: (check it out, I'm learning python string formatting)
stomach with grains: ['3.10', '2.92', '2.78', '2.83', '2.36', '2.75', '3.29']
mean = 2.86
stomach without grains: ['3.33', '3.67', '2.89', '3.37', '3.31', '3.60', '3.00', '2.87', '2.59', '2.80', '2.92', '3.11', '3.22', '3.63', '3.47', '3.29', '2.89', '2.67']
mean = 3.15
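For anyone who wants to poke at this, here's a minimal sketch of an equal-variance two-sample t-test run on the rounded daily values printed above, using only Python's standard library. It is not necessarily the code that produced the numbers earlier; the function names (pooled_t, t_pdf, two_sided_p) are made up for illustration, and the p-value comes from numerically integrating the t distribution's density rather than from a stats library. Because the inputs are rounded to two decimals, expect t and p to land near, but not exactly on, the values reported above.

```python
import math
from statistics import mean, variance

# Daily mean stomach values, copied from the rounded lists above.
with_grains = [3.10, 2.92, 2.78, 2.83, 2.36, 2.75, 3.29]
without_grains = [3.33, 3.67, 2.89, 3.37, 3.31, 3.60, 3.00, 2.87, 2.59,
                  2.80, 2.92, 3.11, 3.22, 3.63, 3.47, 3.29, 2.89, 2.67]


def pooled_t(a, b):
    """Return (t statistic, degrees of freedom) for an equal-variance t-test."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=10_000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    # The density is symmetric, so the area outside [-|t|, |t|] is 1 - 2 * area.
    return 1 - 2 * area_zero_to_t


t, df = pooled_t(with_grains, without_grains)
p = two_sided_p(t, df)
print(f"t = {t:.3f}, df = {df}, p = {p:.3f}")
```

If SciPy is installed, scipy.stats.ttest_ind(with_grains, without_grains) runs the same equal-variance test in one line.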
Huh! That is interesting. Now before you jump to conclusions, note a few things:
- this is self-reported, not double-blind, not even single-blind. (Although in this case I'm the experimenter and the subject, so single-blind = double-blind; and it'd be really hard to make this experiment blind.)
- data was gathered "as I feel like it" AKA whenever I use my phone.
- I cut out 3 days' worth of data because they each had only one sample.
- "days" were split at midnight, even though I usually had one or two points after midnight; I should probably split them at about 3 or 4AM.
- I didn't just cut out grains. I also minimized added sugar (how well? dunno) and added more meat.
- p = 0.059. That's only borderline significant. It could just be a fluke.
- I didn't say what I'd look for before doing the experiment. Why is this such a big deal? Well, p=0.059 means that even if grains didn't matter, an effect at least this big would still show up about 5.9% of the time just by chance. Which means that if I tracked 20 variables, I'd expect about one of them to "look significant" even if nothing was going on.
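To put a number on that last point, here's a quick back-of-the-envelope sketch. It assumes the traditional 0.05 threshold, 20 tracked variables that are independent of each other, and no real effect in any of them; the independence part is an assumption for illustration (mood, energy, and stomach probably aren't independent in real life).

```python
alpha = 0.05        # significance threshold (the traditional one)
n_variables = 20    # hypothetical number of things being tracked

# Expected number of variables that cross the threshold by chance alone.
expected_false_positives = alpha * n_variables

# Chance that at least one of them crosses it, assuming independence.
p_at_least_one = 1 - (1 - alpha) ** n_variables

print(f"expected false positives: {expected_false_positives:.2f}")
print(f"P(at least one false positive): {p_at_least_one:.2f}")
```

So with 20 variables and nothing real going on, you'd expect about one "significant" result, and there's roughly a 64% chance of getting at least one.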
But hey, exploratory pilot study: super success! I think that my energy and my mood are pretty similar with or without grains, and I think that maybe grains make my stomach feel worse, although I'd need to study it again to tell for sure. Very cool!