R. Paul Mihail :: Skyfinder Dataset

Skyfinder

This dataset accompanies our paper "Sky Segmentation in the Wild: An Empirical Study," in IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1–6.

The Skyfinder dataset consists of roughly 90,000 labeled outdoor images captured in a wide range of weather and illumination conditions. To the best of our knowledge, it is the largest existing dataset with annotated sky pixels and associated weather data. It is motivated by the difficulty existing methods have in handling extreme appearance variations of sky regions. The images are taken from AMOS (Archive of Many Outdoor Scenes). We selected 53 cameras that were static (i.e., no camera movement throughout one or more calendar years) and downloaded all the available images for that period (an average of one year, with around ten thousand images per camera). To keep the dataset size reasonable, we kept only five randomly selected frames for each day. For each camera, we manually created a binary mask segmenting sky and ground. The average coverage of sky pixels across all the webcams is 41.19%, with a standard deviation of 15.71%.
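The per-day subsampling and the sky-coverage statistic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used to build the dataset: the function names, the `(day, filename)` input format, and the representation of a mask as rows of 0/1 values are all hypothetical, and the choice of population standard deviation is a guess, since the page does not say which variant was reported.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def sample_frames_per_day(frames, k=5, seed=0):
    """Keep at most k randomly chosen frames for each day.

    `frames` is a list of (day, filename) pairs (hypothetical format).
    Returns a dict mapping each day to a sorted list of kept filenames.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_day = defaultdict(list)
    for day, name in frames:
        by_day[day].append(name)
    return {
        day: sorted(rng.sample(names, min(k, len(names))))
        for day, names in by_day.items()
    }


def sky_coverage(mask):
    """Percentage of pixels labeled sky (1) in a binary mask.

    `mask` is a non-empty list of rows, each a list of 0 (ground) or 1 (sky).
    """
    total = sum(len(row) for row in mask)
    sky = sum(sum(row) for row in mask)
    return 100.0 * sky / total


def coverage_stats(coverages):
    """Mean and population standard deviation of per-camera coverages.

    Assumption: population standard deviation; the source does not specify.
    """
    return mean(coverages), pstdev(coverages)
```

For example, a 3x4 mask whose top six pixels are sky gives `sky_coverage(mask) == 50.0`. Because each camera has a single mask, coverage would be computed once per camera and then aggregated with `coverage_stats`.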

Individual raw images and cameras are now available to download.
Images

Baseline rCNN model
The .zip file contains the following files:

train.net: Caffe model definition (for training)
deploy.net: Caffe model definition (for deployment)
solver.prototxt: Caffe solver configuration
baseline.caffemodel: Caffe model parameters

Ensemble rCNN model
The .zip file contains the following files:

train.net: Caffe model definition (for training)
deploy.net: Caffe model definition (for deployment)
solver.prototxt: Caffe solver configuration
ensemble.caffemodel: Caffe model parameters

Metadata
The .csv file contains the following metadata:

weather data for each frame
MCR (misclassification rate) for all methods discussed in the paper
transient attributes for each image
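
As a rough illustration of the MCR entry above, the sketch below computes a per-pixel misclassification rate between a predicted sky mask and a ground-truth mask. It assumes that MCR means the fraction of pixels whose predicted label differs from the annotated label. The function name and the mask representation (rows of 0/1 values) are hypothetical and do not reflect the layout of the .csv file.

```python
def misclassification_rate(predicted, ground_truth):
    """Fraction of pixels where the predicted label differs from the truth.

    Both arguments are binary masks given as lists of rows of 0/1 values
    (hypothetical representation) and must have identical shapes.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("masks have different numbers of rows")
    wrong = 0
    total = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        if len(pred_row) != len(true_row):
            raise ValueError("mask rows have different lengths")
        for p, g in zip(pred_row, true_row):
            wrong += int(p != g)  # count each mislabeled pixel
            total += 1
    if total == 0:
        raise ValueError("masks are empty")
    return wrong / total


# One of four pixels is mislabeled, so the rate is 0.25.
rate = misclassification_rate([[1, 1], [0, 0]], [[1, 0], [0, 0]])
```

A rate of 0.0 means the prediction matches the annotation exactly, and 1.0 means every pixel is mislabeled.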