The capacities of visual working memory and visual long-term memory play a critical role in theories of cognitive architecture and of the relationship between memory and other cognitive systems. Here, we argue that before asking how capacity varies across different stimuli, or what the upper bound of capacity is for a given memory system, it is necessary to establish a methodology that allows a fair comparison between distinct stimulus sets and conditions. One of the most important factors determining performance in a memory task is target/foil dissimilarity. We argue that only by maximizing the dissimilarity of the target and foil in each stimulus set can we provide a fair basis for memory comparisons between stimuli. In the current work, we introduce a new way to pick such foils objectively for complex, meaningful real-world objects by using deep convolutional neural networks, and we validate this method using both memory tests and similarity metrics. Using this method, we then provide evidence that visual working memory has a greater capacity for real-world objects than for simple colors; critically, we also show that this difference can be reduced or eliminated when non-comparable foils are used, potentially explaining why previous work has not always found such a difference. Our study thus demonstrates that working memory capacity is not fixed but depends critically on the type of information that is remembered, and it offers a way to compare performance in memory and other cognitive systems across different stimulus sets on common ground.