If you request more rows than the dataframe contains, the step returns the original dataframe unchanged.

n_samples
[number, integer]
required

Number of rows to sample. How many random rows to pick from the original dataset (without replacement). A value greater than 1 is interpreted as a count of desired rows. A value smaller than 1 is interpreted as a proportion of the entire dataset.
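
The count-versus-proportion rule can be illustrated with a minimal pandas sketch. This is not the step's actual implementation: the helper name `sample_rows` is hypothetical, and both the truncation of fractional row counts and the treatment of a value of exactly 1 as a count are assumptions made for the example.

```python
import pandas as pd


def sample_rows(df, n_samples, seed=None):
    """Hypothetical helper mimicking how n_samples might be interpreted."""
    if n_samples < 1:
        # Proportion of the dataset; truncating to an integer is an assumption.
        count = int(n_samples * len(df))
    else:
        # Count of rows; treating exactly 1 as a count is an assumption.
        count = int(n_samples)
    if count > len(df):
        # More rows requested than available: return the original dataframe.
        return df
    # Sample without replacement.
    return df.sample(n=count, replace=False, random_state=seed)


df = pd.DataFrame({"x": range(10)})
by_count = sample_rows(df, 3, seed=0)         # 3 rows
by_proportion = sample_rows(df, 0.5, seed=0)  # half of the 10 rows
too_many = sample_rows(df, 50, seed=0)        # the original dataframe
```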

by
string

Sample independently in these groups. If a column is specified here, the sampling is applied separately within each group defined by the unique values in this column. Combining this with a count of rows to pick (rather than a proportion) allows this step to balance the dataset, leading to an (approximately) equal number of rows within each group.
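
As a rough illustration of why a row count combined with `by` balances a dataset, the following pandas sketch samples each group separately. The helper `sample_by_group` and the column name `label` are hypothetical, and keeping a group whole when it has fewer rows than requested is an assumption, which would explain why the balance is only approximate.

```python
import numpy as np
import pandas as pd


def sample_by_group(df, n_samples, by, seed=None):
    """Hypothetical helper: sample each group of `by` independently."""
    rng = np.random.default_rng(seed)
    parts = []
    for _, group in df.groupby(by):
        if n_samples < 1:
            count = int(n_samples * len(group))  # proportion of the group
        else:
            count = int(n_samples)               # fixed count per group
        if count > len(group):
            # Assumption: groups smaller than the request are kept whole.
            parts.append(group)
        else:
            parts.append(group.sample(n=count, replace=False, random_state=rng))
    return pd.concat(parts)


# Imbalanced data: 6 rows of "a", 3 of "b", 1 of "c".
df = pd.DataFrame({"label": list("aaaaaabbbc"), "value": range(10)})
balanced = sample_by_group(df, 2, by="label", seed=1)
counts = balanced["label"].value_counts().to_dict()
```

Here groups `a` and `b` are each reduced to 2 rows, while `c` keeps its single row.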

seed
[number, null]

A value used to initialize the random number generator, making it deterministic (reproducible).
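
The effect of a fixed seed can be sketched with pandas' own `random_state` argument, used here as a stand-in for `seed`. How the step maps `seed` to its random number generator internally is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"x": range(100)})

# The same seed yields the same rows on every run.
first = df.sample(n=5, random_state=42)
second = df.sample(n=5, random_state=42)
same_rows = first.equals(second)

# Leaving the seed unset (None here, corresponding to null) gives a
# non-deterministic sample that may differ between runs.
unseeded = df.sample(n=5, random_state=None)
```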