
When drawing a scatter plot in Python, I want to display only the points that meet certain conditions. How should I code this?
The data is read from a CSV file.

For the code below:
⑴ I want to display only data with an X index of 40 or higher.

⑵ I want to display only the data after November 4, 2000.

Thank you in advance for your help.

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
%matplotlib inline
df = pd.read_csv('data4.csv', index_col=0)
df['gender'] = df['gender'].astype('category')
plt.scatter(df['height'], df['weight'], c=df['gender'].cat.codes)
colors = {'Male': 'blue', 'Female': 'red'}
for f in df['gender'].unique():
    plt.scatter(df.loc[df.gender == f, 'height'], df.loc[df.gender == f, 'weight'], c=colors[f], label=f)
plt.legend()
plt.show()
  • Answer # 1

    (1)
    Prerequisite: create an 'X index' column

    # Create the X index column
    df['X index'] = <some expression>
    colors = {'Male': 'blue', 'Female': 'red'}
    for f in df['gender'].unique():
        # Keep only rows of this gender whose X index is 40 or higher
        selected_df = df.loc[(df.gender == f) & (df['X index'] >= 40)]
        plt.scatter(selected_df['height'], selected_df['weight'], c=colors[f], label=f)

    (2)
    Prerequisite: read the date/time column as a datetime type

    # Read the date/time column as datetime (replace 'datetime' with the actual column name)
    df = pd.read_csv('data4.csv', index_col=0, parse_dates=['datetime'])
    colors = {'Male': 'blue', 'Female': 'red'}
    for f in df['gender'].unique():
        # Keep only rows of this gender dated November 4, 2000 or later
        selected_df = df.loc[(df.gender == f) & (df['datetime'] >= pd.Timestamp('2000-11-04'))]
        plt.scatter(selected_df['height'], selected_df['weight'], c=colors[f], label=f)

    Would something like this work?
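
For anyone who wants to try the filtering without the original file, here is a minimal self-contained sketch. The inline CSV text and the column names `datetime` and `X index` are assumptions standing in for `data4.csv`, whose real layout is not shown in the question. Each condition is built as a boolean mask, and the masks are combined with `&`.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for data4.csv; the real column names may differ.
csv_text = """id,datetime,gender,height,weight,X index
1,2000-11-01,Male,172,68,35
2,2000-11-03,Female,158,50,42
3,2000-11-04,Male,180,77,40
4,2000-11-06,Female,163,55,58
5,2000-11-09,Male,169,64,12
"""
df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=["datetime"])

# One boolean mask per condition; each can also be used on its own.
mask_x = df["X index"] >= 40
mask_date = df["datetime"] >= pd.Timestamp("2000-11-04")

# Combine with & (element-wise AND) and keep only the matching rows.
selected = df[mask_x & mask_date]

colors = {"Male": "blue", "Female": "red"}
fig, ax = plt.subplots()
for gender, group in selected.groupby("gender"):
    ax.scatter(group["height"], group["weight"], color=colors[gender], label=gender)
ax.set_xlabel("height")
ax.set_ylabel("weight")
ax.legend()
```

In a notebook with `%matplotlib inline` the figure appears automatically; in a plain script, call `plt.show()` at the end. Use `df[mask_x]` or `df[mask_date]` alone to apply just one of the two conditions.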

  • Answer # 2

    Specify the axis range of the graph by passing the values directly:

    set_ylim([min, max])

    * Obtain the array of average values with numpy ⇒ set the first of the obtained values as the maximum and the last as the minimum
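
As a rough illustration of the axis-limit approach, the sketch below uses made-up height and weight values (not the data from the question) and fixed limits chosen only for the example. Note that axis limits just crop the visible window: the points remain in the data, and this only helps when the condition is on a plotted axis, not on a separate column such as a date.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical sample values standing in for the height/weight columns.
height = np.array([150, 158, 163, 169, 172, 180])
weight = np.array([45, 50, 55, 64, 68, 77])

fig, ax = plt.subplots()
ax.scatter(height, weight)

# Restrict the visible window; points outside it are still in the data
# but fall outside the drawn area.
ax.set_xlim([160, 185])
ax.set_ylim([50, 80])
```

When working through `pyplot` directly rather than an `Axes` object, `plt.xlim(160, 185)` and `plt.ylim(50, 80)` do the same thing.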