What I want to achieve

I am a DB beginner and would appreciate your help.
I have an existing transaction table that records the times at which errors occurred.
From it, I want to write a SELECT statement that extracts the start time and end time of each error.


PostgreSQL 12
Windows 10

Here is a concrete example.

CREATE TABLE tr_machine_error (
 machine VARCHAR(2),
 error_code VARCHAR(3),
 error_date TIMESTAMP -- PostgreSQL has no DATETIME type
);

machine error_code error_date
A E01 2020/11/17 12:00:00
A E01 2020/11/17 12:10:00
A E01 2020/11/17 12:20:00
A E01 2020/11/17 12:30:00
A E02 2020/11/17 14:20:00
A E02 2020/11/17 14:30:00
A E01 2020/11/17 16:30:00
A E01 2020/11/17 16:40:00
A E01 2020/11/17 16:50:00
B E02 2020/11/17 17:00:00
B E02 2020/11/17 17:10:00
B E02 2020/11/17 17:20:00
A E03 2020/11/17 19:40:00
A E03 2020/11/17 19:50:00
A E03 2020/11/17 20:00:00
A E03 2020/11/17 20:10:00

Assuming a table like the one above, the result I want to output is:

machine error_code error_start error_stop
A E01 2020/11/17 12:00:00 2020/11/17 12:30:00
A E02 2020/11/17 14:20:00 2020/11/17 14:30:00
A E01 2020/11/17 16:30:00 2020/11/17 16:50:00
B E02 2020/11/17 17:00:00 2020/11/17 17:20:00
A E03 2020/11/17 19:40:00 2020/11/17 20:10:00

In other words, each time the machine or the error code changes,
I want to output one row showing the start time and stop time of that run of errors.

What I tried

SELECT
 machine,
 error_code,
 min(error_date) AS error_start,
 max(error_date) AS error_stop
FROM
 tr_machine_error
GROUP BY
 machine, error_code;

I wanted to get the result with max and min, but in the table above the combination of machine A and E01 appears in two separate periods, so the two periods are merged into one row:

machine error_code error_start error_stop
A E01 2020/11/17 12:00:00 2020/11/17 16:50:00

I am sorry if this is hard to follow, but I would appreciate any help.
If you have any questions, please do not hesitate to ask.

  • Answer # 1

    This assumes that error_date values are unique, so the rows can be put in a single chronological order.

    When the rows are read in error_date order, a break occurs wherever machine or error_code differs from the previous row.
    Assign each break to the individual rows as group information, then aggregate over the whole table.
    The window functions lag() and lead() are used to generate the break information.
    * If machine and error_code were concatenated into one string, different pairs could produce the same string, so they are compared as an array[] instead.

    with step1 as (
      select *
           , lag(array[machine, error_code]::text[], 1, array['',''])
             over (order by error_date) lag_key
      from tr_machine_error
    ), step2 as (
      select case when array[machine, error_code]::text[] != lag_key then error_date end error_start
      from step1
    ), step3 as (
      select error_start
      from step2
      where error_start is not null
    ), step4 as (
      select error_start, lead(error_start) over (order by error_start) lead_error_start
      from step3
    )
    select machine, error_code, error_start, max(error_date) as error_end
    from tr_machine_error left join step4
          on tr_machine_error.error_date >= step4.error_start
            and (tr_machine_error.error_date < lead_error_start or lead_error_start is null)
    group by machine, error_code, error_start
    order by error_start


    ・ Explanation
    step1: Use lag() to get the machine and error_code of the previous row.
    step2: Set error_start only on the rows where a break occurs, based on the information from step1.
    step3: Keep only the rows where error_start is set, so that lead() in step4 sees only the start rows.
    step4: Use lead() to get the next error_start, which gives the range of rows that belongs to each error_start.
    Last: Join the original table with step4 and find error_end by aggregation.

    I put this together as it came to mind, so there may well be a more concise way to write it from a different angle.
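
    As one possible shorter formulation, here is a sketch that flags the first row of each run with lag() and then numbers the runs with a running sum. To keep it self-contained and runnable it uses Python's sqlite3 module with an in-memory table instead of PostgreSQL; the names is_start, grp and find_error_periods_sql are made up for this illustration, and the timestamps are written in ISO format so that SQLite sorts them correctly as text. The query uses only standard window functions, so it should carry over to PostgreSQL 12, but that is an assumption and has not been checked there.

```python
import sqlite3

# Sample rows copied from the question (timestamps rewritten in ISO format).
ROWS = [
    ("A", "E01", "2020-11-17 12:00:00"),
    ("A", "E01", "2020-11-17 12:10:00"),
    ("A", "E01", "2020-11-17 12:20:00"),
    ("A", "E01", "2020-11-17 12:30:00"),
    ("A", "E02", "2020-11-17 14:20:00"),
    ("A", "E02", "2020-11-17 14:30:00"),
    ("A", "E01", "2020-11-17 16:30:00"),
    ("A", "E01", "2020-11-17 16:40:00"),
    ("A", "E01", "2020-11-17 16:50:00"),
    ("B", "E02", "2020-11-17 17:00:00"),
    ("B", "E02", "2020-11-17 17:10:00"),
    ("B", "E02", "2020-11-17 17:20:00"),
    ("A", "E03", "2020-11-17 19:40:00"),
    ("A", "E03", "2020-11-17 19:50:00"),
    ("A", "E03", "2020-11-17 20:00:00"),
    ("A", "E03", "2020-11-17 20:10:00"),
]

# flagged: is_start is 1 when the previous row (by error_date) has a different
#          machine or error_code, or when there is no previous row at all.
# grouped: the running sum of is_start gives every run its own number.
QUERY = """
WITH flagged AS (
    SELECT machine, error_code, error_date,
           CASE WHEN machine = LAG(machine) OVER (ORDER BY error_date)
                 AND error_code = LAG(error_code) OVER (ORDER BY error_date)
                THEN 0 ELSE 1 END AS is_start
    FROM tr_machine_error
), grouped AS (
    SELECT machine, error_code, error_date,
           SUM(is_start) OVER (ORDER BY error_date) AS grp
    FROM flagged
)
SELECT machine, error_code,
       MIN(error_date) AS error_start,
       MAX(error_date) AS error_stop
FROM grouped
GROUP BY grp, machine, error_code
ORDER BY error_start
"""


def find_error_periods_sql(rows):
    """Load rows into an in-memory SQLite table and return one row per run."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tr_machine_error "
        "(machine TEXT, error_code TEXT, error_date TEXT)"
    )
    con.executemany("INSERT INTO tr_machine_error VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


for row in find_error_periods_sql(ROWS):
    print(*row)
```

    With the sample data this returns the five periods listed in the question, with A/E01 appearing twice. Like the query above, it relies on error_date being unique; window functions need SQLite 3.25 or later.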

    "I am a DB beginner"

    At a beginner level, this is hard to solve with SQL alone.
    The more natural approach is to read the rows in chronological order and build the start and end times as you go.
    If this task was handed to you with no guidance at all, the project is likely to be a rough one.
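
    The "read the rows in chronological order" idea can be sketched in a few lines of application code. This is only an illustration: the function name find_error_periods and the in-memory list SAMPLE_ROWS are assumptions, and in a real program the rows would come from a plain SELECT ... ORDER BY error_date on tr_machine_error.

```python
# Sample rows copied from the question (timestamps in ISO format, so that
# sorting them as strings is the same as sorting them chronologically).
SAMPLE_ROWS = [
    ("A", "E01", "2020-11-17 12:00:00"),
    ("A", "E01", "2020-11-17 12:10:00"),
    ("A", "E01", "2020-11-17 12:20:00"),
    ("A", "E01", "2020-11-17 12:30:00"),
    ("A", "E02", "2020-11-17 14:20:00"),
    ("A", "E02", "2020-11-17 14:30:00"),
    ("A", "E01", "2020-11-17 16:30:00"),
    ("A", "E01", "2020-11-17 16:40:00"),
    ("A", "E01", "2020-11-17 16:50:00"),
    ("B", "E02", "2020-11-17 17:00:00"),
    ("B", "E02", "2020-11-17 17:10:00"),
    ("B", "E02", "2020-11-17 17:20:00"),
    ("A", "E03", "2020-11-17 19:40:00"),
    ("A", "E03", "2020-11-17 19:50:00"),
    ("A", "E03", "2020-11-17 20:00:00"),
    ("A", "E03", "2020-11-17 20:10:00"),
]


def find_error_periods(rows):
    """Return (machine, error_code, error_start, error_stop) for each run."""
    periods = []
    # Walk through the rows in chronological order.
    for machine, error_code, error_date in sorted(rows, key=lambda r: r[2]):
        last = periods[-1] if periods else None
        if last and last[0] == machine and last[1] == error_code:
            # Same machine and error code as the previous row:
            # the current run continues, so push its stop time forward.
            last[3] = error_date
        else:
            # Machine or error code changed (or this is the first row):
            # a new run starts and, for now, also ends at this time.
            periods.append([machine, error_code, error_date, error_date])
    return [tuple(p) for p in periods]


for period in find_error_periods(SAMPLE_ROWS):
    print(*period)
```

    Each run is closed simply by the next row having a different machine or error code, so no lookahead is needed and the table is read only once.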