CH-benCHmark
Summary
While standardized and widely used benchmarks address either operational or real-time Business Intelligence (BI) workloads, the lack of a hybrid benchmark led us to the definition of a new, complex, mixed workload benchmark, called mixed workload CH-benCHmark.
This benchmark bridges the gap between the established single-workload suites of TPC-C for OLTP and TPC-H for OLAP, and executes a complex mixed workload: a transactional workload based on the order entry processing of TPC-C and a corresponding TPC-H-equivalent OLAP query suite run in parallel on the same tables in a single database system. As it is derived from these two most widely used TPC benchmarks, the CH-benCHmark produces results highly relevant to both hybrid and classic single-workload systems.
People
|
|
Publications
- Scaling up Mixed Workloads: a Battle of Data Freshness, Flexibility, and Scheduling Proceedings of the 6th TPC Technology Conference on Performance Evaluation and Benchmarking (TPC TC) September 2014Source Code
- The Mixed Workload CH-benCHmark. Proceedings of the 4th International Workshop on Testing Database Systems (DBTest) June 2011
- Metrics for Measuring the Performance of the Mixed Workload CH-benCHmark Proceedings of the 3rd TPC Technology Conference on Performance Evaluation and Benchmarking (TPC TC) August 2011
The Benchmark: Analytical Queries
Query 1
This query reports the total amount and quantity of all shipped orderlines given by a specific time period. Additionally it informs about the average amount and quantity plus the total count of all these orderlines ordered by the individual orderline number.
select ol_number, sum(ol_quantity) as sum_qty, sum(ol_amount) as sum_amount, avg(ol_quantity) as avg_qty, avg(ol_amount) as avg_amount, count(*) as count_order from orderline where ol_delivery_d > '2007-01-02 00:00:00.000000' group by ol_number order by ol_number
Query 2
Query for listing suppliers and their distributed items having the lowest stock level for a certain item and certain region.
select su_suppkey, su_name, n_name, i_id, i_name, su_address, su_phone, su_comment from item, supplier, stock, nation, region, (select s_i_id as m_i_id, min(s_quantity) as m_s_quantity from stock, supplier, nation, region where mod((s_w_id*s_i_id),10000)=su_suppkey and su_nationkey=n_nationkey and n_regionkey=r_regionkey and r_name like 'Europ%' group by s_i_id) m where i_id = s_i_id and mod((s_w_id * s_i_id), 10000) = su_suppkey and su_nationkey = n_nationkey and n_regionkey = r_regionkey and i_data like '%b' and r_name like 'Europ%' and i_id=m_i_id and s_quantity = m_s_quantity order by n_name, su_name, i_id
Query 3
Unshipped orders with the highest price amount for a customer will be listed within a given state and with orders newer than a specific timestamp. This list will be sorted by the descending amount.
select ol_o_id, ol_w_id, ol_d_id, sum(ol_amount) as revenue, o_entry_d from customer, neworder, orders, orderline where c_state like 'A%' and c_id = o_c_id and c_w_id = o_w_id and c_d_id = o_d_id and no_w_id = o_w_id and no_d_id = o_d_id and no_o_id = o_id and ol_w_id = o_w_id and ol_d_id = o_d_id and ol_o_id = o_id and o_entry_d > '2007-01-02 00:00:00.000000' group by ol_o_id, ol_w_id, ol_d_id, o_entry_d order by revenue desc, o_entry_d
Query 4
This query is listing all orders with orderlines or just parts of them shipped after the entry date of their booking.
select o_ol_cnt, count(*) as order_count from orders where o_entry_d >= '2007-01-02 00:00:00.000000' and o_entry_d < '2012-01-02 00:00:00.000000' and exists (select * from orderline where o_id = ol_o_id and o_w_id = ol_w_id and o_d_id = ol_d_id and ol_delivery_d >= o_entry_d) group by o_ol_cnt order by o_ol_cnt
Query 5
Query result for getting information about achieved revenues of nations within a given region. All nations are sorted by the total amount of revenue gained since the given date.
select n_name, sum(ol_amount) as revenue from customer, orders, orderline, stock, supplier, nation, region where c_id = o_c_id and c_w_id = o_w_id and c_d_id = o_d_id and ol_o_id = o_id and ol_w_id = o_w_id and ol_d_id=o_d_id and ol_w_id = s_w_id and ol_i_id = s_i_id and mod((s_w_id * s_i_id),10000) = su_suppkey and ascii(substr(c_state,1,1)) = su_nationkey and su_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name = 'Europe' and o_entry_d >= '2007-01-02 00:00:00.000000' group by n_name order by revenue desc
Query 6
Query lists the total amount of archived revenue from orderlines which were delivered in a specific period and a certain quantity.
select sum(ol_amount) as revenue from orderline where ol_delivery_d >= '1999-01-01 00:00:00.000000' and ol_delivery_d < '2020-01-01 00:00:00.000000' and ol_quantity between 1 and 100000
Query 7
Query for showing the bi-directional trade volume between two given nations sorted by their names and the considered years.
select su_nationkey as supp_nation, substr(c_state,1,1) as cust_nation, extract(year from o_entry_d) as l_year, sum(ol_amount) as revenue from supplier, stock, orderline, orders, customer, nation n1, nation n2 where ol_supply_w_id = s_w_id and ol_i_id = s_i_id and mod((s_w_id * s_i_id), 10000) = su_suppkey and ol_w_id = o_w_id and ol_d_id = o_d_id and ol_o_id = o_id and c_id = o_c_id and c_w_id = o_w_id and c_d_id = o_d_id and su_nationkey = n1.n_nationkey and ascii(substr(c_state,1,1)) = n2.n_nationkey and ( (n1.n_name = 'Germany' and n2.n_name = 'Cambodia') or (n1.n_name = 'Cambodia' and n2.n_name = 'Germany') ) and ol_delivery_d between '2007-01-02 00:00:00.000000' and '2012-01-02 00:00:00.000000' group by su_nationkey, substr(c_state,1,1), extract(year from o_entry_d) order by su_nationkey, cust_nation, l_year
Query 8
This query lists the market share of a given nation for customers from a certain region in which kinds of items are "produced".
select extract(year from o_entry_d) as l_year, sum(case when n2.n_name = 'Germany' then ol_amount else 0 end) / sum(ol_amount) as mkt_share from item, supplier, stock, orderline, orders, customer, nation n1, nation n2, region where i_id = s_i_id and ol_i_id = s_i_id and ol_supply_w_id = s_w_id and mod((s_w_id * s_i_id),10000) = su_suppkey and ol_w_id = o_w_id and ol_d_id = o_d_id and ol_o_id = o_id and c_id = o_c_id and c_w_id = o_w_id and c_d_id = o_d_id and n1.n_nationkey = ascii(substr(c_state,1,1)) and n1.n_regionkey = r_regionkey and ol_i_id < 1000 and r_name = 'Europe' and su_nationkey = n2.n_nationkey and o_entry_d between '2007-01-02 00:00:00.000000' and '2012-01-02 00:00:00.000000' and i_data like '%b' and i_id = ol_i_id group by extract(year from o_entry_d) order by l_year
Query 9
This query describes how much profit has been made on a selection of items for each nation and each year. The result list will be sorted by the name of the nation and the financial year.
select n_name, extract(year from o_entry_d) as l_year, sum(ol_amount) as sum_profit from item, stock, supplier, orderline, orders, nation where ol_i_id = s_i_id and ol_supply_w_id = s_w_id and mod((s_w_id * s_i_id), 10000) = su_suppkey and ol_w_id = o_w_id and ol_d_id = o_d_id and ol_o_id = o_id and ol_i_id = i_id and su_nationkey = n_nationkey and i_data like '%BB' group by n_name, extract(year from o_entry_d) order by n_name, l_year desc
Query 10
Query for analyzing the expenses of all customers listing their living country, some detail of them and the amount of money which they have used to take their orders since a specific date. The whole list is sorted by the amount of the customers’ orders.
select c_id, c_last, sum(ol_amount) as revenue, c_city, c_phone, n_name from customer, orders, orderline, nation where c_id = o_c_id and c_w_id = o_w_id and c_d_id = o_d_id and ol_w_id = o_w_id and ol_d_id = o_d_id and ol_o_id = o_id and o_entry_d >= '2007-01-02 00:00:00.000000' and o_entry_d <= ol_delivery_d and n_nationkey = ascii(substr(c_state,1,1)) group by c_id, c_last, c_city, c_phone, n_name order by revenue desc
Query 11
Most important items (items which are often involved in orders and therefore often bought by customers) supplied by supplier of a given nation.
select s_i_id, sum(s_order_cnt) as ordercount from stock, supplier, nation where mod((s_w_id * s_i_id),10000) = su_suppkey and su_nationkey = n_nationkey and n_name = 'Germany' group by s_i_id having sum(s_order_cnt) > (select sum(s_order_cnt) * .005 from stock, supplier, nation where mod((s_w_id * s_i_id),10000) = su_suppkey and su_nationkey = n_nationkey and n_name = 'Germany') order by ordercount desc
Query 12
This query counts the amount of orders grouped by the number of orderlines in each order attending the number of orders which are shipped with a higher or lower order priority.
select o_ol_cnt, sum(case when o_carrier_id = 1 or o_carrier_id = 2 then 1 else 0 end) as high_line_count, sum(case when o_carrier_id <> 1 and o_carrier_id <> 2 then 1 else 0 end) as low_line_count from orders, orderline where ol_w_id = o_w_id and ol_d_id = o_d_id and ol_o_id = o_id and o_entry_d <= ol_delivery_d and ol_delivery_d < '2020-01-01 00:00:00.000000' group by o_ol_cnt order by o_ol_cnt
Query 13
The query lists the number of customers grouped and sorted by the size of orders they made. The result set of the relation between customers and the size of their orders is sorted by the size of orders and counts how many customers have dealt the same way.
select c_count, count(*) as custdist from (select c_id, count(o_id) from customer left outer join orders on ( c_w_id = o_w_id and c_d_id = o_d_id and c_id = o_c_id and o_carrier_id > 8) group by c_id) as c_orders (c_id, c_count) group by c_count order by custdist desc, c_count desc
Query 14
The query result represents the percentage of the revenue in a period of time which has been realized from promotional campaigns.
select 100.00 * sum(case when i_data like 'PR%' then ol_amount else 0 end) / (1+sum(ol_amount)) as promo_revenue from orderline, item where ol_i_id = i_id and ol_delivery_d >= '2007-01-02 00:00:00.000000' and ol_delivery_d < '2020-01-02 00:00:00.000000'
Query 15
This query finds the top supplier or suppliers who contributed the most to the overall revenue for items shipped during a given period of time.
with revenue (supplier_no, total_revenue) as ( select mod((s_w_id * s_i_id),10000) as supplier_no, sum(ol_amount) as total_revenue from orderline, stock where ol_i_id = s_i_id and ol_supply_w_id = s_w_id and ol_delivery_d >= '2007-01-02 00:00:00.000000' group by mod((s_w_id * s_i_id),10000)) select su_suppkey, su_name, su_address, su_phone, total_revenue from supplier, revenue where su_suppkey = supplier_no and total_revenue = (select max(total_revenue) from revenue) order by su_suppkey
Query 16
This query finds out how many suppliers are able to supply items with given attributes sorted in descending order of them. The result is grouped by the identifier of the item.
select i_name, substr(i_data, 1, 3) as brand, i_price, count(distinct (mod((s_w_id * s_i_id),10000))) as supplier_cnt from stock, item where i_id = s_i_id and i_data not like 'zz%' and (mod((s_w_id * s_i_id),10000) not in (select su_suppkey from supplier where su_comment like '%bad%')) group by i_name, substr(i_data, 1, 3), i_price order by supplier_cnt desc
Query 17
The query determines the yearly loss in revenue if orders just with a quantity of more than the average quantity of all orders in the system would be taken and shipped to customers.
select sum(ol_amount) / 2.0 as avg_yearly from orderline, (select i_id, avg(ol_quantity) as a from item, orderline where i_data like '%b' and ol_i_id = i_id group by i_id) t where ol_i_id = t.i_id and ol_quantity < t.a
Query 18
Query 18 is ranking all customers who have ordered for more than a specific amount of money.
select c_last, c_id o_id, o_entry_d, o_ol_cnt, sum(ol_amount) from customer, orders, orderline where c_id = o_c_id and c_w_id = o_w_id and c_d_id = o_d_id and ol_w_id = o_w_id and ol_d_id = o_d_id and ol_o_id = o_id group by o_id, o_w_id, o_d_id, c_id, c_last, o_entry_d, o_ol_cnt having sum(ol_amount) > 200 order by sum(ol_amount) desc, o_entry_d
Query 19
The query is for reporting the revenue achieved by some specific attributes, as the price, the detailed information of the item and the quantity of the ordered amount of them.
select sum(ol_amount) as revenue from orderline, item where ( ol_i_id = i_id and i_data like '%a' and ol_quantity >= 1 and ol_quantity <= 10 and i_price between 1 and 400000 and ol_w_id in (1,2,3) ) or ( ol_i_id = i_id and i_data like '%b' and ol_quantity >= 1 and ol_quantity <= 10 and i_price between 1 and 400000 and ol_w_id in (1,2,4) ) or ( ol_i_id = i_id and i_data like '%c' and ol_quantity >= 1 and ol_quantity <= 10 and i_price between 1 and 400000 and ol_w_id in (1,5,3) )
Query 20
Suppliers in a particular nation having selected parts that may be candidates for a promotional offer if the quantity of these items is more than 50 percent of the total quantity which has been ordered since a certain date.
select su_name, su_address from supplier, nation where su_suppkey in (select mod(s_i_id * s_w_id, 10000) from stock, orderline where s_i_id in (select i_id from item where i_data like 'co%') and ol_i_id=s_i_id and ol_delivery_d > '2010-05-23 12:00:00' group by s_i_id, s_w_id, s_quantity having 2*s_quantity > sum(ol_quantity)) and su_nationkey = n_nationkey and n_name = 'Germany' order by su_name
Query 21
Query 21 determines the suppliers which have shipped some required items of an order not in a timely manner for a given nation.
select su_name, count(*) as numwait from supplier, orderline l1, orders, stock, nation where ol_o_id = o_id and ol_w_id = o_w_id and ol_d_id = o_d_id and ol_w_id = s_w_id and ol_i_id = s_i_id and mod((s_w_id * s_i_id),10000) = su_suppkey and l1.ol_delivery_d > o_entry_d and not exists (select * from orderline l2 where l2.ol_o_id = l1.ol_o_id and l2.ol_w_id = l1.ol_w_id and l2.ol_d_id = l1.ol_d_id and l2.ol_delivery_d > l1.ol_delivery_d) and su_nationkey = n_nationkey and n_name = 'Germany' group by su_name order by numwait desc, su_name
Query 22
This query lists how many customers within a specific range of country codes have not bought anything for the whole period of time and who have a greater than average balance on their account. The county code is represented by the first two characters of the phone number.
select substr(c_state,1,1) as country, count(*) as numcust, sum(c_balance) as totacctbal from customer where substr(c_phone,1,1) in ('1','2','3','4','5','6','7') and c_balance > (select avg(c_BALANCE) from customer where c_balance > 0.00 and substr(c_phone,1,1) in ('1','2','3','4','5','6','7')) and not exists (select * from orders where o_c_id = c_id and o_w_id = c_w_id and o_d_id = c_d_id) group by substr(c_state,1,1) order by substr(c_state,1,1)