Transforming a dataset into a more compact format (Stata)

Initially I was dealing with a dataset that looked something like this:
+------+--------+-----------+-------+
| date | geo | variables | value |
+------+--------+-----------+-------+
| 1981 | Canada | var1 | # |
| 1982 | Canada | var1 | # |
| 1983 | Canada | var1 | # |
| ... | ... | ... | ... |
| 2015 | Canada | var1 | # |
| 1981 | Canada | var2 | # |
| 1982 | Canada | var2 | # |
| ... | ... | ... | ... |
| 2015 | Canada | var2 | # |
| 1981 | Quebec | var1 | # |
| 1982 | Quebec | var1 | # |
| 1983 | Quebec | var1 | # |
| ... | ... | ... | ... |
| 2015 | Quebec | var1 | # |
| 1981 | Quebec | var2 | # |
| 1982 | Quebec | var2 | # |
| ... | ... | ... | ... |
| 2015 | Quebec | var2 | # |
+------+--------+-----------+-------+
So I have 35 time periods, two geographies and two variables. I would like to transform the table in Stata so that it looks like this:
+------+--------+------+------+
| date | geo | var1 | var2 |
+------+--------+------+------+
| 1981 | Canada | # | # |
| 1982 | Canada | # | # |
| ... | ... | ... | ... |
| 2015 | Canada | # | # |
| 1981 | Quebec | # | # |
| 1982 | Quebec | # | # |
| ... | ... | ... | ... |
| 2015 | Quebec | # | # |
+------+--------+------+------+
However, I'm not having much success with this. I tried to separate the different observations into variables with the command:
separate value, by(variables) generate(var)
Which creates something like this:
+------+--------+------+------+
| date | geo | var1 | var2 |
+------+--------+------+------+
| 1981 | Canada | # | . |
| 1982 | Canada | # | . |
| ... | ... | ... | ... |
| 2015 | Canada | # | . |
| 1981 | Canada | . | # |
| 1982 | Canada | . | # |
| ... | ... | ... | ... |
| 2015 | Canada | . | # |
| 1981 | Quebec | # | . |
| 1982 | Quebec | # | . |
| ... | ... | ... | ... |
| 2015 | Quebec | # | . |
| 1981 | Quebec | . | # |
| 1982 | Quebec | . | # |
| ... | ... | ... | ... |
| 2015 | Quebec | . | # |
+------+--------+------+------+
This contains a lot of useless missing values.
So, more specifically, I would like something that takes me directly from the first table to the second (i.e. without using separate), or a way to turn the third table into the second.
Thanks a lot.

Without sample data, my answer will have to be untested. I think something like the following will get you started in the right direction.
reshape wide value, i(date geo) j(variables) string
Note that this assumes the contents of your variables variable are suitable for use as variable names. For example, a value of 1potato for variables would be a problem.
In any event,
help reshape
should be your first stop.
Added in response to comment: Here is some data I made up and a demonstration that reshape works for this data. Perhaps you can explain how this data differs from the real data. Your error message suggests that for some combination of date and geo, a particular value of variables occurs more than once.
. list, sepby(geo)
+----------------------------------+
| date geo variab~s value |
|----------------------------------|
1. | 1981 Canada var1 111 |
2. | 1982 Canada var1 211 |
3. | 1983 Canada var1 311 |
4. | 1981 Canada var2 112 |
5. | 1982 Canada var2 212 |
6. | 1983 Canada var2 312 |
|----------------------------------|
7. | 1981 Quebec var1 121 |
8. | 1982 Quebec var1 221 |
9. | 1983 Quebec var1 321 |
10. | 1981 Quebec var2 122 |
11. | 1982 Quebec var2 222 |
12. | 1983 Quebec var2 322 |
+----------------------------------+
. reshape wide value, i(geo date) j(variables) string
(note: j = var1 var2)
Data long -> wide
-----------------------------------------------------------------------------
Number of obs. 12 -> 6
Number of variables 4 -> 4
j variable (2 values) variables -> (dropped)
xij variables:
value -> valuevar1 valuevar2
-----------------------------------------------------------------------------
. rename (value*) (*)
. list, sepby(geo)
+-----------------------------+
| date geo var1 var2 |
|-----------------------------|
1. | 1981 Canada 111 112 |
2. | 1982 Canada 211 212 |
3. | 1983 Canada 311 312 |
|-----------------------------|
4. | 1981 Quebec 121 122 |
5. | 1982 Quebec 221 222 |
6. | 1983 Quebec 321 322 |
+-----------------------------+
.
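To make the uniqueness requirement concrete, here is a minimal sketch in plain Python (not Stata) of what a long-to-wide reshape does. The function name `reshape_wide` and the error message are hypothetical, and the rows are a subset of the made-up data above. The point is only that each combination of date, geo and variables may appear once, otherwise the wide layout is ambiguous.

```python
# Long-format rows modelled on the made-up data above: (date, geo, variables, value).
long_rows = [
    (1981, "Canada", "var1", 111),
    (1981, "Canada", "var2", 112),
    (1982, "Canada", "var1", 211),
    (1982, "Canada", "var2", 212),
    (1981, "Quebec", "var1", 121),
    (1981, "Quebec", "var2", 122),
]


def reshape_wide(rows):
    """Pivot long rows into one dict per (date, geo), with one key per variable name."""
    wide = {}
    for date, geo, variable, value in rows:
        record = wide.setdefault((date, geo), {"date": date, "geo": geo})
        if variable in record:
            # The same (date, geo, variable) was seen twice: the wide layout is ambiguous.
            raise ValueError(f"duplicate {variable!r} for date={date}, geo={geo!r}")
        record[variable] = value
    return list(wide.values())


wide_rows = reshape_wide(long_rows)
for row in wide_rows:
    print(row)
```

If the real data triggers the duplicate branch in a sketch like this, the same rows are likely what reshape is complaining about.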

Related

I can't apply multiple conditions in this SQL statement

I don't know why it doesn't give me the answer for students enrolled in Database Systems but not in Operating System Design.
select student.snum, student.sname, enrolled.cname
from enrolled
inner join student ON enrolled.snum = student.snum
where enrolled.cname="Database Systems" AND enrolled.cname<>"Operating System Design";
+-----------+--------------------+------------------+
| snum | sname | cname |
+-----------+--------------------+------------------+
| 112348546 | Joseph Thompson | Database Systems |
| 115987938 | Christopher Garcia | Database Systems |
| 348121549 | Paul Hall | Database Systems |
| 322654189 | Lisa Walker | Database Systems |
| 552455318 | Ana Lopez | Database Systems |
+-----------+--------------------+------------------+
My student table.
+-----------+--------------------+------------------------+-------+------+
| snum | sname | major | level | age |
+-----------+--------------------+------------------------+-------+------+
| 51135593 | Maria White | English | SR | 21 |
| 60839453 | Charles Harris | Architecture | SR | 22 |
| 99354543 | Susan Martin | Law | JR | 20 |
| 112348546 | Joseph Thompson | Computer Science | SO | 19 |
| 115987938 | Christopher Garcia | Computer Science | JR | 20 |
| 132977562 | Angela Martinez | History | SR | 20 |
| 269734834 | Thomas Robinson | Psychology | SO | 18 |
| 280158572 | Margaret Clark | Animal Science | FR | 18 |
| 301221823 | Juan Rodriguez | Psychology | JR | 20 |
| 318548912 | Dorthy Lewis | Finance | FR | 18 |
| 320874981 | Daniel Lee | Electrical Engineering | FR | 17 |
| 322654189 | Lisa Walker | Computer Science | SO | 17 |
| 348121549 | Paul Hall | Computer Science | JR | 18 |
| 351565322 | Nancy Allen | Accounting | JR | 19 |
| 451519864 | Mark Young | Finance | FR | 18 |
| 455798411 | Luis Hernandez | Electrical Engineering | FR | 17 |
| 462156489 | Donald King | Mechanical Engineering | SO | 19 |
| 550156548 | George Wright | Education | SR | 21 |
| 552455318 | Ana Lopez | Computer Engineering | SR | 19 |
| 556784565 | Kenneth Hill | Civil Engineering | SR | 21 |
| 567354612 | Karen Scott | Computer Engineering | FR | 18 |
| 573284895 | Steven Green | Kinesiology | SO | 19 |
| 574489456 | Betty Adams | Economics | JR | 20 |
| 578875478 | Edward Baker | Veterinary Medicine | SR | 21 |
+-----------+--------------------+------------------------+-------+------+
My enrolled table
+-----------+----------------------------+
| snum | cname |
+-----------+----------------------------+
| 112348546 | Database Systems |
| 115987938 | Database Systems |
| 348121549 | Database Systems |
| 322654189 | Database Systems |
| 552455318 | Database Systems |
| 455798411 | Operating System Design |
| 552455318 | Operating System Design |
| 567354612 | Operating System Design |
| 112348546 | Operating System Design |
| 115987938 | Operating System Design |
| 322654189 | Operating System Design |
| 567354612 | Data Structures |
| 552455318 | Communication Networks |
| 455798411 | Optical Electronics |
| 301221823 | Perception |
| 301221823 | Social Cognition |
| 301221823 | American Political Parties |
| 556784565 | Air Quality Engineering |
| 99354543 | Patent Law |
| 574489456 | Urban Economics |
+-----------+----------------------------+

You need to use the NOT EXISTS as follows:
select s.snum, s.sname, e.cname
from enrolled e
inner join student s ON e.snum = s.snum
where e.cname='Database Systems'
AND not exists
(select 1 from enrolled ee
where ee.snum = e.snum and ee.cname = 'Operating System Design');
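A WHERE clause is evaluated one row at a time, so a row whose cname is 'Database Systems' always passes a test that cname is not 'Operating System Design'. The sketch below, run against an in-memory SQLite database from Python, compares that row-wise filter with the NOT EXISTS form. It uses a three-student subset of the tables in the question, and the subquery checks the inner alias ee. SQLite stands in for the asker's database here, which is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (snum INTEGER PRIMARY KEY, sname TEXT);
    CREATE TABLE enrolled (snum INTEGER, cname TEXT);
    INSERT INTO student VALUES
        (112348546, 'Joseph Thompson'),
        (348121549, 'Paul Hall'),
        (552455318, 'Ana Lopez');
    INSERT INTO enrolled VALUES
        (112348546, 'Database Systems'),
        (348121549, 'Database Systems'),
        (552455318, 'Database Systems'),
        (112348546, 'Operating System Design'),
        (552455318, 'Operating System Design');
""")

# Row-wise filter from the question: the second condition never removes anything.
row_wise = conn.execute("""
    SELECT s.snum, s.sname
    FROM enrolled e
    INNER JOIN student s ON e.snum = s.snum
    WHERE e.cname = 'Database Systems'
      AND e.cname <> 'Operating System Design'
""").fetchall()

# NOT EXISTS looks at the student's other enrolment rows through the alias ee.
not_exists = conn.execute("""
    SELECT s.snum, s.sname
    FROM enrolled e
    INNER JOIN student s ON e.snum = s.snum
    WHERE e.cname = 'Database Systems'
      AND NOT EXISTS (
          SELECT 1 FROM enrolled ee
          WHERE ee.snum = e.snum
            AND ee.cname = 'Operating System Design')
""").fetchall()

print(len(row_wise), not_exists)
```

With this subset, only Paul Hall is enrolled in Database Systems without also being enrolled in Operating System Design.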

Compare strings using single quotation marks (' ') rather than double quotation marks (" "). Your code seems ok to me.
Remember:
Single quotes are for strings.
Double quotes are for table names and column names.
select student.snum, student.sname, enrolled.cname
from enrolled
inner join student ON enrolled.snum = student.snum
where enrolled.cname='Database Systems'
AND enrolled.cname<>'Operating System Design';
Or try this:
select student.snum, student.sname, enrolled.cname
from enrolled
inner join student ON enrolled.snum = student.snum
where enrolled.cname='Database Systems';

Why does using a char type as a loop index give unexpected results?

Bear in mind this is an old version of the C compiler: CP/M for Z80.
#include <stdio.h>
main()
{
    char i = 0;
    do
    {
        printf("0x%04x | ", i);
    } while (++i);
}
Expected:
0x0000 | 0x0001 | 0x0002 | 0x0003 | 0x0004 | 0x0005 | 0x0006 | 0x0007 | 0x0008 | 0x0009 | 0x000A | 0x000B | 0x000C | 0x000D | 0x000E | 0x000F | 0x0010 | 0x0011 | 0x0012 | 0x0013 | 0x0014 | 0x0015 | 0x0016 | 0x0017 | 0x0018 | 0x0019 | 0x001A | 0x001B | 0x001C | 0x001D | 0x001E | 0x001F | 0x0020 | 0x0021 | 0x0022 | 0x0023 | 0x0024 | 0x0025 | 0x0026 | 0x0027 | 0x0028 | 0x0029 | 0x002A | 0x002B | 0x002C | 0x002D | 0x002E | 0x002F | 0x0030 | 0x0031 | 0x0032 | 0x0033 | 0x0034 | 0x0035 | 0x0036 | 0x0037 | 0x0038 | 0x0039 | 0x003A | 0x003B | 0x003C | 0x003D | 0x003E | 0x003F | 0x0040 | 0x0041 | 0x0042 | 0x0043 | 0x0044 | 0x0045 | 0x0046 | 0x0047 | 0x0048 | 0x0049 | 0x004A | 0x004B | 0x004C | 0x004D | 0x004E | 0x004F | 0x0050 | 0x0051 | 0x0052 | 0x0053 | 0x0054 | 0x0055 | 0x0056 | 0x0057 | 0x0058 | 0x0059 | 0x005A | 0x005B | 0x005C | 0x005D | 0x005E | 0x005F | 0x0060 | 0x0061 | 0x0062 | 0x0063 | 0x0064 | 0x0065 | 0x0066 | 0x0067 | 0x0068 | 0x0069 | 0x006A | 0x006B | 0x006C | 0x006D | 0x006E | 0x006F | 0x0070 | 0x0071 | 0x0072 | 0x0073 | 0x0074 | 0x0075 | 0x0076 | 0x0077 | 0x0078 | 0x0079 | 0x007A | 0x007B | 0x007C | 0x007D | 0x007E | 0x007F | 0x0080 | 0x0081 | 0x0082 | 0x0083 | 0x0084 | 0x0085 | 0x0086 | 0x0087 | 0x0088 | 0x0089 | 0x008A | 0x008B | 0x008C | 0x008D | 0x008E | 0x008F | 0x0090 | 0x0091 | 0x0092 | 0x0093 | 0x0094 | 0x0095 | 0x0096 | 0x0097 | 0x0098 | 0x0099 | 0x009A | 0x009B | 0x009C | 0x009D | 0x009E | 0x009F | 0x00A0 | 0x00A1 | 0x00A2 | 0x00A3 | 0x00A4 | 0x00A5 | 0x00A6 | 0x00A7 | 0x00A8 | 0x00A9 | 0x00AA | 0x00AB | 0x00AC | 0x00AD | 0x00AE | 0x00AF | 0x00B0 | 0x00B1 | 0x00B2 | 0x00B3 | 0x00B4 | 0x00B5 | 0x00B6 | 0x00B7 | 0x00B8 | 0x00B9 | 0x00BA | 0x00BB | 0x00BC | 0x00BD | 0x00BE | 0x00BF | 0x00C0 | 0x00C1 | 0x00C2 | 0x00C3 | 0x00C4 | 0x00C5 | 0x00C6 | 0x00C7 | 0x00C8 | 0x00C9 | 0x00CA | 0x00CB | 0x00CC | 0x00CD | 0x00CE | 0x00CF | 0x00D0 | 0x00D1 | 0x00D2 | 0x00D3 | 0x00D4 | 0x00D5 | 0x00D6 | 0x00D7 | 0x00D8 | 0x00D9 | 0x00DA | 0x00DB | 0x00DC | 0x00DD | 
0x00DE | 0x00DF | 0x00E0 | 0x00E1 | 0x00E2 | 0x00E3 | 0x00E4 | 0x00E5 | 0x00E6 | 0x00E7 | 0x00E8 | 0x00E9 | 0x00EA | 0x00EB | 0x00EC | 0x00ED | 0x00EE | 0x00EF | 0x00F0 | 0x00F1 | 0x00F2 | 0x00F3 | 0x00F4 | 0x00F5 | 0x00F6 | 0x00F7 | 0x00F8 | 0x00F9 | 0x00FA | 0x00FB | 0x00FC | 0x00FD | 0x00FE | 0x00FF |
Actual:
0x0A00 | 0x0A01 | 0x0A02 | 0x0A03 | 0x0A04 | 0x0A05 | 0x0A06 | 0x0A07 | 0x0A08 | 0x0A09 | 0x0A0A | 0x0A0B | 0x0A0C | 0x0A0D | 0x0A0E | 0x0A0F | 0x0A10 | 0x0A11 | 0x0A12 | 0x0A13 | 0x0A14 | 0x0A15 | 0x0A16 | 0x0A17 | 0x0A18 | 0x0A19 | 0x0A1A | 0x0A1B | 0x0A1C | 0x0A1D | 0x0A1E | 0x0A1F | 0x0A20 | 0x0A21 | 0x0A22 | 0x0A23 | 0x0A24 | 0x0A25 | 0x0A26 | 0x0A27 | 0x0A28 | 0x0A29 | 0x0A2A | 0x0A2B | 0x0A2C | 0x0A2D | 0x0A2E | 0x0A2F | 0x0A30 | 0x0A31 | 0x0A32 | 0x0A33 | 0x0A34 | 0x0A35 | 0x0A36 | 0x0A37 | 0x0A38 | 0x0A39 | 0x0A3A | 0x0A3B | 0x0A3C | 0x0A3D | 0x0A3E | 0x0A3F | 0x0A40 | 0x0A41 | 0x0A42 | 0x0A43 | 0x0A44 | 0x0A45 | 0x0A46 | 0x0A47 | 0x0A48 | 0x0A49 | 0x0A4A | 0x0A4B | 0x0A4C | 0x0A4D | 0x0A4E | 0x0A4F | 0x0A50 | 0x0A51 | 0x0A52 | 0x0A53 | 0x0A54 | 0x0A55 | 0x0A56 | 0x0A57 | 0x0A58 | 0x0A59 | 0x0A5A | 0x0A5B | 0x0A5C | 0x0A5D | 0x0A5E | 0x0A5F | 0x0A60 | 0x0A61 | 0x0A62 | 0x0A63 | 0x0A64 | 0x0A65 | 0x0A66 | 0x0A67 | 0x0A68 | 0x0A69 | 0x0A6A | 0x0A6B | 0x0A6C | 0x0A6D | 0x0A6E | 0x0A6F | 0x0A70 | 0x0A71 | 0x0A72 | 0x0A73 | 0x0A74 | 0x0A75 | 0x0A76 | 0x0A77 | 0x0A78 | 0x0A79 | 0x0A7A | 0x0A7B | 0x0A7C | 0x0A7D | 0x0A7E | 0x0A7F | 0x0A80 | 0x0A81 | 0x0A82 | 0x0A83 | 0x0A84 | 0x0A85 | 0x0A86 | 0x0A87 | 0x0A88 | 0x0A89 | 0x0A8A | 0x0A8B | 0x0A8C | 0x0A8D | 0x0A8E | 0x0A8F | 0x0A90 | 0x0A91 | 0x0A92 | 0x0A93 | 0x0A94 | 0x0A95 | 0x0A96 | 0x0A97 | 0x0A98 | 0x0A99 | 0x0A9A | 0x0A9B | 0x0A9C | 0x0A9D | 0x0A9E | 0x0A9F | 0x0AA0 | 0x0AA1 | 0x0AA2 | 0x0AA3 | 0x0AA4 | 0x0AA5 | 0x0AA6 | 0x0AA7 | 0x0AA8 | 0x0AA9 | 0x0AAA | 0x0AAB | 0x0AAC | 0x0AAD | 0x0AAE | 0x0AAF | 0x0AB0 | 0x0AB1 | 0x0AB2 | 0x0AB3 | 0x0AB4 | 0x0AB5 | 0x0AB6 | 0x0AB7 | 0x0AB8 | 0x0AB9 | 0x0ABA | 0x0ABB | 0x0ABC | 0x0ABD | 0x0ABE | 0x0ABF | 0x0AC0 | 0x0AC1 | 0x0AC2 | 0x0AC3 | 0x0AC4 | 0x0AC5 | 0x0AC6 | 0x0AC7 | 0x0AC8 | 0x0AC9 | 0x0ACA | 0x0ACB | 0x0ACC | 0x0ACD | 0x0ACE | 0x0ACF | 0x0AD0 | 0x0AD1 | 0x0AD2 | 0x0AD3 | 0x0AD4 | 0x0AD5 | 0x0AD6 | 0x0AD7 | 0x0AD8 | 0x0AD9 | 0x0ADA | 0x0ADB | 0x0ADC | 0x0ADD | 
0x0ADE | 0x0ADF | 0x0AE0 | 0x0AE1 | 0x0AE2 | 0x0AE3 | 0x0AE4 | 0x0AE5 | 0x0AE6 | 0x0AE7 | 0x0AE8 | 0x0AE9 | 0x0AEA | 0x0AEB | 0x0AEC | 0x0AED | 0x0AEE | 0x0AEF | 0x0AF0 | 0x0AF1 | 0x0AF2 | 0x0AF3 | 0x0AF4 | 0x0AF5 | 0x0AF6 | 0x0AF7 | 0x0AF8 | 0x0AF9 | 0x0AFA | 0x0AFB | 0x0AFC | 0x0AFD | 0x0AFE | 0x0AFF |
What am I doing wrong?
Assembly:
cseg
?59999:
defb 48,120,37,48,52,120,32,124,32,0
main#:
ld c,0
#0:
push bc
push bc
ld bc,?59999
push bc
ld hl,2
call printf
pop bc
pop bc
pop bc
inc c
jp nz,#0
ret
public main#
extrn printf
end

Golly. LONG time since I used a z80 C compiler, and most were buggy as [unprintable] back then.
I would suggest that you dump the assembler if the compiler allows. My GUESS is that internally the char is being promoted to a 16 bit INT with indeterminate upper bits set.
The problem is that %04x expects an int, not a char.
You might try forcing the compiler to play nice by explicitly casting the char to an int, i.e.
printf("0x%04x | ", (int) i);
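The guess about indeterminate upper bits can be modelled with a little arithmetic. The sketch below is Python, not Z80 or C code: it only shows what a 16-bit word looks like when its low byte is the loop counter and its high byte is whatever was left in the other half of the register pair. The helper `pushed_word` and the constant `STALE_B` are hypothetical names, and the value 0x0A is taken from the output shown in the question.

```python
def pushed_word(high_byte, low_byte):
    """Combine two 8-bit register values into the 16-bit word that gets pushed."""
    return ((high_byte & 0xFF) << 8) | (low_byte & 0xFF)


STALE_B = 0x0A  # assumption: leftover high byte, as seen in the question's output

# Pushing the whole register pair: the stale high byte leaks into the printed value.
uncast = ["0x%04x" % pushed_word(STALE_B, i) for i in range(3)]

# What a proper widening of the char to a 16-bit int would give: high byte cleared.
widened = ["0x%04x" % pushed_word(0x00, i) for i in range(3)]

print(" | ".join(uncast))
print(" | ".join(widened))
```

The first line printed has the same 0x0a prefix as the actual output in the question, and the second has the 0x00 prefix of the expected output.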

Most probably, being an old 8-bit compiler, it does not convert the char variable i to an int; it simply pushes the bc register, assuming the called function will not use the high byte. That assumption is wrong here, because printf() expects a whole int as its parameter, and you don't know what the b register holds. Since you use the %x format, which is for an int parameter, printf() reads the full bc value, and this explains the high byte 0x0a in the output (a value that doesn't appear anywhere in your assembler listing). Later versions of the standard began to convert every short and char argument to int, probably to avoid this kind of issue.
Try this code, and see if that solves the problem.
#include <stdio.h>
main()
{
    char i = 0;
    do
    {
        printf("0x%04x | ", (int) i);
    } while (++i);
}
(I cannot check here, as I have a Z80 computer but not a C compiler for it.)
Edit
After checking the assembler code: the compiler output just pushes the complete bc register onto the stack. Its lower part (the c register) holds the character you want to print, but the b register was previously loaded with the high byte of the 59999 pointer to the array of characters of the format string, which happens to be 0xea. So I was puzzled by the output, which should probably be 0xea00, 0xea01, 0xea02, ... and not the output you have. Did you recompile the source to get the assembler output, so that the output shown comes from a different compilation?
To dig a little more I'd need the code of the printf() function, which I assume you don't have. But it seems that converting the parameter to (int) before passing it to printf() should solve the problem.

Designing a database for categories and subcategories

Basically I'm trying to figure out how Amazon architected their book section. Check out Amazon's book page here (https://www.amazon.com/s/ref=lp_2_ex_n_1?rh=n%3A283155&bbn=283155&ie=UTF8&qid=1522817105).
We are given several main categories: Arts & Photography, Biographies & Memoirs, etc.
If I click on Biographies & Memoirs, for example, I'm led to a series of subcategories, e.g. Biographies & Memoirs > Historical > Asia > Japan
There are repeating subcategory names, for example: History > Asia > Japan
How can I map this kind of information so that it is scalable?
Is the design below the wrong way to do it?
Categories table
+----+-----------------------+-----------+
| id | name | parent_id |
+----+-----------------------+-----------+
| 1 | Biographies & Memoirs | null |
| 2 | Historical | 1 |
| 3 | Asia | 2 |
| 4 | History | null |
| 5 | Asia | 4 |
| 6 | Japan | 5 |
| 7 | Japan | 3 |
+----+-----------------------+-----------+
Books
+----+-------------------------------------+----------+
| id | name | category |
+----+-------------------------------------+----------+
| 1 | The Lone Samurai | 7 |
| 2 | The Human Tradition in Modern Japan | 7 |
| 3 | Okinawa: The Last Battle | 6 |
+----+-------------------------------------+----------+
Authors
+----+---------------+----------+
| id | firstname | lastname |
+----+---------------+----------+
| 1 | James M. | Burns |
| 2 | Roy E. | Appleman |
| 3 | Russell A. | Gugeler |
| 4 | John | Stevens |
| 5 | William Scott | Wilson |
| 6 | Anne | Walthall |
+----+---------------+----------+
Authors to books (Many to many)
+---------+-----------+
| book_id | author_id |
+---------+-----------+
| 3 | 1 |
| 3 | 2 |
| 3 | 3 |
| 3 | 4 |
| 1 | 5 |
| 2 | 6 |
+---------+-----------+
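For what it's worth, the Categories table above is an adjacency list, and the repeated names do not clash in it because each row has its own id and parent_id. The sketch below loads those seven rows into an in-memory SQLite database from Python and rebuilds the full path of every category with a recursive query. The lowercase table name `categories` and the CTE name `tree` are assumptions, and this only shows how such a table can be walked; it is not a claim about how Amazon stores its data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
    INSERT INTO categories VALUES
        (1, 'Biographies & Memoirs', NULL),
        (2, 'Historical', 1),
        (3, 'Asia', 2),
        (4, 'History', NULL),
        (5, 'Asia', 4),
        (6, 'Japan', 5),
        (7, 'Japan', 3);
""")

# Start from the root categories and repeatedly attach children to their parent's path.
rows = conn.execute("""
    WITH RECURSIVE tree(id, path) AS (
        SELECT id, name FROM categories WHERE parent_id IS NULL
        UNION ALL
        SELECT c.id, tree.path || ' > ' || c.name
        FROM categories c
        JOIN tree ON c.parent_id = tree.id
    )
    SELECT id, path FROM tree ORDER BY id
""").fetchall()

paths = dict(rows)
for category_id, path in rows:
    print(category_id, path)
```

The two Japan rows (ids 6 and 7) resolve to different paths, so a book pointing at category 7 is unambiguous even though the name repeats.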

Multiple outcomes/scenarios

I have a problem that I have already solved, but I'm wondering if there's a better way of solving it. Basically, I have to create a flag for certain scenarios within a partition of ID and date. My solution involved mapping out all the possible scenarios, then writing "case when" statements that assign the specific outcome for each of them. In other words, I was the one who created the outcomes. I am wondering if there's another way, something along the lines of letting SQL derive the outcomes instead of doing it myself.
Thanks a lot!
Background:
+----+-----------+--------+-------+------+-----------------+-----------------------------------------------------------------------------------+
| ID | Month | Status | Value | Flag | Scenario Number | Scenario Description |
+----+-----------+--------+-------+------+-----------------+-----------------------------------------------------------------------------------+
| 1 | 1/01/2016 | First | 123 | No | 1 | First, second and blank exists. Do not flag |
| 1 | 1/01/2016 | Second | 456 | No | 1 | First, second and blank exists. Do not flag |
| 1 | 1/01/2016 | | 789 | No | 1 | First, second and blank exists. Do not flag |
| 1 | 1/02/2016 | Second | 123 | Yes | 2 | First does not exist, two second but have different values. Flag these as Yes |
| 1 | 1/02/2016 | Second | 456 | Yes | 2 | First does not exist, two second but have different values. Flag these as Yes |
| 1 | 1/02/2016 | Second | 123 | No | 3 | First does not exist, two second have same values. Do not flag |
| 1 | 1/02/2016 | Second | 123 | No | 3 | First does not exist, two second have same values. Do not flag |
| 1 | 1/03/2016 | Second | 123 | No | 4 | Only one entry of Second exist. Do not flag |
| 1 | 1/04/2016 | | 123 | Yes | 5 | Two blanks for the partition. Flag these as Yes |
| 1 | 1/04/2016 | | 123 | Yes | 5 | Two blanks for the partition. Flag these as Yes |
| 1 | 1/05/2016 | | | No | 6 | Only one entry of blank exist. Do not flag these |
| 1 | 1/06/2016 | First | 123 | Yes | 7 | First exist for the partition. Do not flag |
| 1 | 1/06/2016 | | 456 | Yes | 7 | First exist for the partition. Do not flag |
| 1 | 1/07/2016 | Second | 123 | Yes | 8 | First does not exist and second and blank do not have the same value. Flag these. |
| 1 | 1/07/2016 | | 456 | Yes | 8 | First does not exist and second and blank do not have the same value. Flag these. |
| 1 | 1/07/2016 | Second | 123 | Yes | 8 | First does not exist and second and blank have the same value. Flag these. |
| 1 | 1/07/2016 | | 123 | Yes | 8 | First does not exist and second and blank have the same value. Flag these. |
+----+-----------+--------+-------+------+-----------------+-----------------------------------------------------------------------------------+
Data:
+----+-----------+-------+----------+---------------+
| ID | Month | Value | Priority | Expected_Flag |
+----+-----------+-------+----------+---------------+
| 1 | 1/01/2016 | 96.01 | | Yes |
| 1 | 1/01/2016 | 96.01 | | Yes |
| 1 | 1/02/2016 | 65.2 | First | No |
| 1 | 1/02/2016 | 3.47 | Second | No |
| 1 | 1/02/2016 | 45.99 | | No |
| 11 | 1/01/2016 | 25 | | No |
| 11 | 1/02/2016 | 74.25 | Second | No |
| 11 | 1/02/2016 | 74.25 | Second | No |
| 11 | 1/02/2016 | 23.25 | | No |
| 24 | 1/01/2016 | 1.25 | First | No |
| 24 | 1/01/2016 | 1.365 | | No |
| 24 | 1/04/2016 | 1.365 | First | No |
| 24 | 1/04/2016 | 1.365 | | No |
| 24 | 1/05/2016 | 1.365 | First | No |
| 24 | 1/05/2016 | 1.365 | First | No |
| 24 | 1/06/2016 | 1.365 | Second | No |
| 24 | 1/06/2016 | 1.365 | Second | No |
| 24 | 1/07/2016 | 1.365 | Second | Yes |
| 24 | 1/07/2016 | 1.365 | | Yes |
| 24 | 1/08/2016 | 1.365 | First | No |
| 24 | 1/08/2016 | 1.365 | | No |
| 24 | 1/09/2016 | 1.365 | Second | No |
| 24 | 1/09/2016 | 1.365 | | No |
| 27 | 1/01/2016 | 0 | Second | Yes |
| 27 | 1/01/2016 | 0 | Second | Yes |
| 27 | 1/02/2016 | 45.25 | Second | No |
| 3 | 1/01/2016 | 96.01 | First | No |
| 3 | 1/01/2016 | 96.01 | First | No |
| 3 | 1/03/2016 | 96.01 | First | No |
| 3 | 1/03/2016 | 96.01 | First | No |
| 35 | 1/01/2016 | | | Yes |
| 35 | 1/01/2016 | | | Yes |
| 35 | 1/02/2016 | | First | No |
| 35 | 1/02/2016 | | Second | No |
| 35 | 1/02/2016 | | | No |
| 35 | 1/02/2016 | | | No |
| 35 | 1/03/2016 | | Second | Yes |
| 35 | 1/03/2016 | | Second | Yes |
| 35 | 1/04/2016 | | Second | No |
| 35 | 1/04/2016 | | Second | No |
+----+-----------+-------+----------+---------------+

How to make a SQL "IF-THEN-ELSE" statement

I've seen other questions about SQL If-then-else stuff, but I'm not seeing how to relate it to what I'm trying to do. I've been using SQL for about a year now but only basic stuff and never this.
If I have a SQL table that looks like this
| Name | Version | Category | Value | Number |
|:-----:|:-------:|:--------:|:-----:|:------:|
| File1 | 1.0 | Time | 123 | 1 |
| File1 | 1.0 | Size | 456 | 1 |
| File1 | 1.0 | Final | 789 | 1 |
| File2 | 1.0 | Time | 312 | 1 |
| File2 | 1.0 | Size | 645 | 1 |
| File2 | 1.0 | Final | 978 | 1 |
| File3 | 1.0 | Time | 741 | 1 |
| File3 | 1.0 | Size | 852 | 1 |
| File3 | 1.0 | Final | 963 | 1 |
| File1 | 1.1 | Time | 369 | 2 |
| File1 | 1.1 | Size | 258 | 2 |
| File1 | 1.1 | Final | 147 | 2 |
| File2 | 1.1 | Time | 741 | 2 |
| File2 | 1.1 | Size | 734 | 2 |
| File2 | 1.1 | Final | 942 | 2 |
| File3 | 1.1 | Time | 997 | 2 |
| File3 | 1.1 | Size | 997 | 2 |
| File3 | 1.1 | Final | 985 | 2 |
How can I write a SQL IF, ELSE statement that creates a new column called "Replication" that follows this rule:
A = B + 1 when x = 1
else
A = B
where A = the number we will use for the next Number
B = Max(Number)
x = Replication count (this is the number of times that a loop is executed. x=i)
The results table will look like this:
| Name | Version | Category | Value | Number | Replication |
|:-----:|:-------:|:--------:|:-----:|:------:|:-----------:|
| File1 | 1.0 | Time | 123 | 1 | 1 |
| File1 | 1.0 | Size | 456 | 1 | 1 |
| File1 | 1.0 | Final | 789 | 1 | 1 |
| File2 | 1.0 | Time | 312 | 1 | 1 |
| File2 | 1.0 | Size | 645 | 1 | 1 |
| File2 | 1.0 | Final | 978 | 1 | 1 |
| File1 | 1.0 | Time | 369 | 1 | 2 |
| File1 | 1.0 | Size | 258 | 1 | 2 |
| File1 | 1.0 | Final | 147 | 1 | 2 |
| File2 | 1.0 | Time | 741 | 1 | 2 |
| File2 | 1.0 | Size | 734 | 1 | 2 |
| File2 | 1.0 | Final | 942 | 1 | 2 |
| File1 | 1.1 | Time | 997 | 2 | 1 |
| File1 | 1.1 | Size | 997 | 2 | 1 |
| File1 | 1.1 | Final | 985 | 2 | 1 |
| File2 | 1.1 | Time | 438 | 2 | 1 |
| File2 | 1.1 | Size | 735 | 2 | 1 |
| File2 | 1.1 | Final | 768 | 2 | 1 |
| File1 | 1.1 | Time | 786 | 2 | 2 |
| File1 | 1.1 | Size | 486 | 2 | 2 |
| File1 | 1.1 | Final | 135 | 2 | 2 |
| File2 | 1.1 | Time | 379 | 2 | 2 |
| File2 | 1.1 | Size | 943 | 2 | 2 |
| File2 | 1.1 | Final | 735 | 2 | 2 |
EDIT: Based on the answer by Sean Lange, this is my 2nd attempt at a solution:
SELECT COALESCE(MAX(Number) + CASE WHEN Replication = 1 THEN 1 ELSE 0 END, 1) FROM Table
The COALESCE is in there for when there is no value yet in the Number column.

The IF/ELSE construct is used to control the flow of statements in T-SQL. You want a CASE expression, which is used to conditionally return values in a column.
https://msdn.microsoft.com/en-us/library/ms181765.aspx
Yours would be something like:
case when x = 1 then B + 1 else B end as A
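As a rough illustration of a CASE expression returning a value, the sketch below applies the rule from the question (MAX(Number) plus one when the replication count is 1, otherwise MAX(Number)) against an in-memory SQLite database from Python. SQLite rather than SQL Server is used so the example is self-contained, and the table name `results` and function name `next_number` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE results (Name TEXT, Number INTEGER);
    INSERT INTO results VALUES ('File1', 1), ('File2', 1), ('File1', 2), ('File2', 2);
""")


def next_number(replication):
    """Return MAX(Number) + 1 when replication is 1, otherwise MAX(Number)."""
    row = conn.execute(
        """
        SELECT COALESCE(MAX(Number), 0)
               + CASE WHEN ? = 1 THEN 1 ELSE 0 END
        FROM results
        """,
        (replication,),
    ).fetchone()
    return row[0]


print(next_number(1), next_number(2))
```

Here the CASE expression only decides whether 1 or 0 is added, and the COALESCE covers the case where the table has no rows yet.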

As SeanLange pointed out, in this case it would be better to use a CASE/WHEN, but to illustrate how to use IF/ELSE, the way to do it in SQL is like this:
IF x = 1
BEGIN
    -- Do something
END
ELSE
BEGIN
    -- Do something else
END
I would say the best way to tell when to use which is this: if you are writing a query and want a different value to appear in a field based on a certain condition, use CASE/WHEN. If a certain condition should cause a series of steps to happen, use IF/ELSE.
