Sample Representation in the Social Sciences
The social sciences face a problem of sample nonrepresentation: the majority of samples consist of undergraduate students from Euro-American institutions. The problem has been recognized for decades, with little sign of improvement. In this paper, I trace the history of sampling theory. The dominant framework, called the design-based approach, takes random sampling as the gold standard. The idea is that a maximally uninformative sampling procedure prevents samplers from introducing arbitrary bias, thus preserving sample representation. I show how this framework, while good in theory, faces many challenges in application. Instead, I advocate for an alternative framework, called the model-based approach to sampling, where representative samples are those balanced in composition, however they were drawn. I argue that the model-based framework is more appropriate in the social sciences because it allows for systematic assessment of imperfect samples and methodical improvement in resource-limited scientific contexts. I end with practical proposals for improving sample quality in the social sciences.
A post-peer-review, pre-copyedit version of this article, published in Synthese, can be found here. The final authenticated version is available online at: http://dx.doi.org/10.1007/s11229-020-02621-3
Statistical Learning Theory and the Problem of Induction
One “easier” form of the problem of induction questions our ability to pick out true regularities in nature from limited data, on the assumption that such regularities do exist. Harman and Kulkarni (2012) take this problem to be a challenge to our ability to identify precise conditions under which the method of picking hypotheses based on limited datasets is or is not reliable. They identify an influential result from statistical learning theory, hereafter referred to as the VC theorem (Vapnik and Chervonenkis, 2015), which states that, provided the starting hypothesis set has finite VC dimension, the hypothesis chosen from it converges to the true regularity as the size of the dataset goes to infinity.
This result seems to provide us with a condition (i.e., having finite VC dimension) under which a method (i.e., choosing a hypothesis based on its performance over data) is reliable. Indeed, Harman and Kulkarni take this result to be an answer to the form of the problem of induction they have identified. This paper examines this claim. By discussing the details of how the VC theorem may be construed as an answer, and the connection between the VC theorem in statistical learning theory and the NIP property in model theory, I conclude that the VC theorem cannot give us the kind of general answers needed for Harman and Kulkarni’s response to the problem of induction.
A shorter version of the draft, presented at the 2018 PSA meeting, can be found here.
Intuitionistic Probabilism in Epistemology
This paper examines the plausibility of a thesis of probabilism based on intuitionistic logic and sets out the difficulties such a program faces. The paper starts by motivating intuitionistic logic as the logic of investigation, along lines similar to the reasoning behind Bayesian epistemology. It then considers two existing axiom systems for intuitionistic probability functions, that of Weatherson (2003) and that of Roeper and Leblanc (1999), and discusses the relationship between the two. It is shown that a natural adaptation of an accuracy argument in the style of Joyce (1998) and de Finetti (1974) to these systems fails. The paper concludes with some philosophical reflections on the results.
The paper has not been published. You can read a draft of it here. It was presented at the 2018 Philosophy of Logic, Mathematics, and Physics Graduate Conference (LMP) at the University of Western Ontario.