UpSet.js is a JavaScript re-implementation of UpSetR, which itself is based on UpSet. The core library is written in React, but bundled editions are also provided for plain JavaScript use, along with this Jupyter wrapper.
In this tutorial the basic widget functionality is explained.
Let's begin by importing the widget and some utilities:
from ipywidgets import interact
from upsetjs_jupyter_widget import UpSetJSWidget
import pandas as pd
This wrapper is implemented in Python 3 with mypy typings and generics. The generic type T of the UpSetJSWidget is the type of element that we want to handle. In the following example we handle str elements.
w = UpSetJSWidget[str]()
Note: The input data will be described in more detail in the next section
dict_input = {'one': ['a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'], 'two': ['a', 'b', 'd', 'e', 'j'], 'three': ['a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm']}
w.from_dict(dict_input)
An UpSet plot consists of three areas:
Moving the mouse over a bar or a dot automatically highlights the corresponding set or set intersection in orange. In addition, the number of elements shared with the highlighted set is also highlighted. This gives a quick overview of how sets and set intersections are related to each other. See the Interaction section for more details.
In the bottom right corner there are two buttons for exporting the chart in either PNG or SVG image format.
In the current version the UpSet.js wrapper supports three input data formats: dictionary, expression, and Pandas data frame.
The first format is a dictionary of type Dict[str, List[T]], where T again refers to the element type; in this case each value is a list of str. The key of each dictionary entry is the set name, while the value is the list of elements in that set.
w.from_dict({'one': ['a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'], 'two': ['a', 'b', 'd', 'e', 'j'], 'three': ['a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm']})
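To make the dictionary format concrete, the following plain-Python sketch (no widget involved) derives the element universe and the per-set cardinalities from the same input. The names `sets`, `elems`, and `cardinalities` are illustrative only and are not part of the wrapper's API.

```python
# Same input as above: set name -> list of elements.
sets = {
    'one': ['a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'],
    'two': ['a', 'b', 'd', 'e', 'j'],
    'three': ['a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm'],
}

# Every distinct element that occurs in at least one set.
elems = sorted(set().union(*sets.values()))

# Number of distinct elements per set, i.e., the value behind each set bar.
cardinalities = {name: len(set(members)) for name, members in sets.items()}

print(elems)
print(cardinalities)
```

The thirteen elements 'a' to 'm' and the sizes 9, 5, and 9 are what the other two input formats have to reproduce to describe the same set structure.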
The second format is a mapping of type Dict[str, number], i.e., it has to have an .items() -> Iterator[Tuple[str, number]] method. The key of each entry is the set combination name, while the value is the number of elements in that set combination.
w.from_expression({'one': 9, 'two': 5, 'three': 9, 'one&two': 3, 'one&three': 6, 'two&three': 3, 'one&two&three': 2})
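The counts passed above match the plain (non-distinct) intersection sizes of the dictionary example. As a sanity check, the sketch below recomputes them with the standard library only. Joining set names with `&` mirrors the keys used above; the variable names are illustrative assumptions.

```python
from itertools import combinations

sets = {
    'one': {'a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'},
    'two': {'a', 'b', 'd', 'e', 'j'},
    'three': {'a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm'},
}

names = list(sets)
expression = {}
for degree in range(1, len(names) + 1):
    for combo in combinations(names, degree):
        # Elements contained in every set of this combination.
        shared = set.intersection(*(sets[name] for name in combo))
        expression['&'.join(combo)] = len(shared)

print(expression)
```

Because this format carries only counts, the individual elements are not known to the widget when it is used.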
The third format is a binary/boolean data frame. The index column contains the list of elements. Each regular column represents a set, with boolean values (e.g., 0 and 1) indicating whether the row identified by the index value is part of the set or not.
The following data frame defines the same set structure as the dictionary format before.
df = pd.DataFrame(dict(
one=[1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1],
two=[1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
three=[1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1]
), index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm'])
w.from_dataframe(df)
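The 0/1 columns above do not have to be typed by hand. A minimal sketch, using only the standard library, of how the same membership columns can be derived from the dictionary input (`index` and `columns` are illustrative names):

```python
sets = {
    'one': ['a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'],
    'two': ['a', 'b', 'd', 'e', 'j'],
    'three': ['a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm'],
}

# Row labels: every element, sorted for a stable order.
index = sorted(set().union(*sets.values()))

# One 0/1 column per set: 1 if the element belongs to the set, else 0.
columns = {
    name: [1 if elem in members else 0 for elem in index]
    for name, members in sets.items()
}

for name, column in columns.items():
    print(name, column)
```

Passing `columns` and `index` to `pd.DataFrame(columns, index=index)` should then yield a frame equivalent to `df`.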
.elems returns the list of extracted elements:
w.elems
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm']
.sets returns the list of extracted sets as UpSetSet objects:
w.sets
[UpSetSet(name=one, elems={'m', 'l', 'g', 'c', 'e', 'a', 'h', 'k', 'b'}), UpSetSet(name=three, elems={'m', 'l', 'g', 'i', 'e', 'f', 'a', 'h', 'j'}), UpSetSet(name=two, elems={'e', 'a', 'j', 'd', 'b'})]
Similarly, .combinations returns the list of visualized set intersections as UpSetSetIntersection objects.
Note: the attribute is called .combinations instead of .intersections since one can customize the generation of the set combinations that are visualized. E.g., one can also generate set unions.
w.combinations
[UpSetSetIntersection(name=one, sets={'one'}, elems={'m', 'l', 'g', 'c', 'e', 'a', 'h', 'k', 'b'}), UpSetSetIntersection(name=three, sets={'three'}, elems={'m', 'l', 'g', 'i', 'e', 'f', 'a', 'h', 'j'}), UpSetSetIntersection(name=(one ∩ three), sets={'three', 'one'}, elems={'m', 'e', 'l', 'a', 'g', 'h'}), UpSetSetIntersection(name=two, sets={'two'}, elems={'e', 'a', 'j', 'd', 'b'}), UpSetSetIntersection(name=(one ∩ two), sets={'two', 'one'}, elems={'a', 'e', 'b'}), UpSetSetIntersection(name=(three ∩ two), sets={'three', 'two'}, elems={'a', 'j', 'e'})]
.generate_intersections, .generate_distinct_intersections, and .generate_unions let you customize the generation of the set combinations:
min_degree ... minimum number of sets in a set combination
max_degree ... maximum number of sets in a set combination, None means no limit
empty ... include empty set combinations with no elements. By default they are not included
order_by ... sort set combinations either by cardinality (number of elements) or by degree (number of sets)
limit ... show only the first limit set combinations
w.copy().generate_distinct_intersections()
w.copy().generate_intersections(min_degree=2, max_degree=None, empty=True, order_by="cardinality", limit=None)
w.copy().generate_unions(min_degree=0, max_degree=2, empty=True, order_by="degree", limit=None)
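The difference between the three generators is easiest to see on the example data. The sketch below is a plain-Python approximation, not the wrapper's implementation. `sketch_combinations` is a hypothetical helper whose parameters merely mirror the ones listed above. It assumes that a distinct intersection contains only the elements that belong to exactly the listed sets and to no other set, and that sorting by cardinality is descending while sorting by degree is ascending.

```python
from itertools import combinations

sets = {
    'one': {'a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'},
    'two': {'a', 'b', 'd', 'e', 'j'},
    'three': {'a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm'},
}


def sketch_combinations(sets, kind, min_degree=0, max_degree=None,
                        empty=False, order_by='cardinality', limit=None):
    """Hypothetical helper; kind is 'intersection', 'distinct', or 'union'."""
    names = list(sets)
    top = len(names) if max_degree is None else min(max_degree, len(names))
    result = []
    # The degree-0 combination (no set at all) is skipped in this sketch.
    for degree in range(max(min_degree, 1), top + 1):
        for combo in combinations(names, degree):
            members = [sets[name] for name in combo]
            if kind == 'union':
                elems = set.union(*members)
            else:
                elems = set.intersection(*members)
                if kind == 'distinct':
                    # Drop elements that also occur in a set outside the combination.
                    for other in names:
                        if other not in combo:
                            elems = elems - sets[other]
            if elems or empty:
                result.append((combo, elems))
    if order_by == 'cardinality':
        result.sort(key=lambda item: -len(item[1]))
    else:  # 'degree'
        result.sort(key=lambda item: len(item[0]))
    return result[:limit]


for kind in ('intersection', 'distinct', 'union'):
    print(kind)
    for combo, elems in sketch_combinations(sets, kind):
        print('  ', combo, len(elems))
```

Under these assumptions every element falls into exactly one distinct intersection, so the distinct cardinalities add up to the number of elements, whereas plain intersections and unions count shared elements several times.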
UpSet.js supports four interaction modes, settable via .mode:
'hover' (default): when the user hovers over a set or set intersection, it is highlighted
'click': when the user clicks on a set or set intersection, the selection is updated
'contextMenu': when the user right-clicks on a set or set intersection, the selection is updated
'static': disables interactivity
w.mode = 'click'
w
With .selection one manually sets the selection that is currently highlighted. Manually setting the selection is only useful in click and static modes.
w.selection = w.sets[0]
w
The current selection is synced with the server. It is designed to work with the interact function of the ipywidgets package. In the following example the currently selected set is automatically written below the chart and updated interactively.
w.mode = 'hover'
def selection_changed(s):
    return s  # s["name"] if s else None
interact(selection_changed, s=w)
<function __main__.selection_changed(s)>
Besides the selection, UpSet.js supports defining queries. A query highlights either a list of elements or a set. It consists of a name, a color, and either the list of elements or the set (combination) to highlight.
wq = w.copy()
wq.mode = 'static'
wq.selection = None
wq.append_query('Q1', color='red', elems=['a', 'b', 'c'])
wq.append_query('Q2', color='blue', upset=wq.sets[1])
wq
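To get a feeling for what an element query highlights, the sketch below counts how many of the query's elements fall into each set. It assumes that the highlighted portion of a bar corresponds to this overlap, and it uses plain Python sets rather than the widget's own objects.

```python
sets = {
    'one': {'a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'},
    'two': {'a', 'b', 'd', 'e', 'j'},
    'three': {'a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm'},
}

# The elements of the first (red) query defined above.
query_elems = {'a', 'b', 'c'}

# Number of query elements contained in each set.
overlap = {name: len(members & query_elems) for name, members in sets.items()}
print(overlap)
```

A set-based query can be treated the same way by using the elements of the referenced set as `query_elems`.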
UpSet.js supports rendering boxplots as aggregations of numerical attributes of the elements. The attributes are given as part of the data frame. The attributes argument can be either a list of column names or a data frame with the same index.
df_a = pd.DataFrame(dict(
one=[1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1],
two=[1, 1, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0],
three=[1, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1],
attr=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
), index=['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm'])
wa = w.copy()
wa.from_dataframe(df_a, attributes=['attr'])
wa
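A boxplot summarizes the attribute values of all elements in a set. The following sketch computes a five-number summary per set for the attr column above, using only the standard library. The quartile method ('inclusive') is an assumption; UpSet.js may compute quartiles and whiskers differently, so the numbers are indicative only.

```python
from statistics import median, quantiles

sets = {
    'one': ['a', 'b', 'c', 'e', 'g', 'h', 'k', 'l', 'm'],
    'two': ['a', 'b', 'd', 'e', 'j'],
    'three': ['a', 'e', 'f', 'g', 'h', 'i', 'j', 'l', 'm'],
}

# Same attribute values as in df_a: 'a' -> 1, 'b' -> 2, ..., 'm' -> 13.
attr = dict(zip('abcdefghijklm', range(1, 14)))


def five_number_summary(values):
    """Return (min, q1, median, q3, max) for a list of numbers."""
    values = sorted(values)
    q1, _, q3 = quantiles(values, n=4, method='inclusive')
    return min(values), q1, median(values), q3, max(values)


summary = {
    name: five_number_summary([attr[elem] for elem in members])
    for name, members in sets.items()
}

for name, stats in summary.items():
    print(name, stats)
```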
w_dark = w.copy()
w_dark.theme = 'dark'
w_dark
w_label = w.copy()
w_label.title = 'Chart Title'
w_label.description = 'a long chart description'
w_label.set_name = 'Set Label'
w_label.combination_name = 'Combination Label'
w_label
Setting .numeric_scale = 'log' switches to a log scale; similarly, 'linear' switches back to a linear scale.
w_log = w.copy()
w_log.numeric_scale = 'log'
w_log
The .width and .height properties can be used to specify the width and height of the chart, respectively. In general, the .layout of the Jupyter widget can be used to customize it.
w.height = 600
w