Why does Speedtest CLI upload fail for US mobile carriers? - mobile

I'm trying to run Speedtest CLI on an embedded Linux device using LTE in the US but the upload fails:
Speedtest by Ookla
Server: North Central Telephone Coop - Lafayette, TN (id = 4895)
ISP: Verizon Wireless
Latency: 49.88 ms (6.93 ms jitter)
Download: 9.23 Mbps (data used: 9.8 MB )
Upload: FAILED
[error] Cannot write:
This works fine if I put the same SIM card in a Wifi access point then connect to it, so it's something to do with the setup of the device. But it works fine in other countries with local carriers.

Turns out the problem is that Linux defaults the MTU to 1500 but US mobile carriers prefer a lower value:
https://www.digi.com/support/knowledge-base/recommended-mtu-mru-settings-on-cellular-networks
Recommended values are:
* AT&T 1420 bytes
* Verizon 1428 bytes
* Others 1430 bytes
(PPP interface default is 1500 bytes)
To ensure our routers work well on all carriers. set the PPP interface local and remote MRU to 1420.
Changing the MTU to 1420 fixed the problem:
ifconfig <Interface_name> mtu <mtu_size> up
Full instructions here: https://linuxhint.com/how-to-change-mtu-size-in-linux/

Related

ALSA ignores configuration

I have an audio box that can be connected via USB to my laptop.
I've written a C application that uses the ALSA API to open a communication channel with this audio box.
The communication should be established at 8kHz, running with a 10ms period size (that is 80 samples).
If I'm connecting the audio box to my laptop and then start the app, it seems that the min period size supported is 170 (e.g. snd_pcm_hw_params_get_period_size_min sets min period size to 170), while snd_pcm_hw_params_set_period_size_near sets the period size to 170.
Looking to /proc/asound/name-of-the-card/stream0, I can see Momentary freq = 48000 Hz (0x30.0000), but the sampling rate requested by me is 8kHz.
Also, snd_pcm_hw_params_set_rate_near call, is not changing the value that I've passed.
By starting the app first and then connecting the audio box to my laptop, snd_pcm_hw_params_get_period_size_min sets the min period size to 16 and when calling snd_pcm_hw_params_set_period_size_near, the period size is set to 80 (which represents what I want to achieve).
Checking again /proc/asound/name-of-the-card/stream0, I can see Momentary freq = 8000 Hz (0x8.0000), that is correct.
I have to mention that my app is trying to open the card associated with the audio box and if the operation doesn't succeed, is retrying the open it every 200ms until succeeds.
My feeling is that in the second case when the period size is set accordingly, my application sets the configuration before the system does (I'm not sure if the system does this).
I've tried to modify defaults.pcm.dmix.rate to 8000 in /usr/share/alsa/alsa.conf, but in this case the period size that is returned by acting as in the first scenario is 1024.
Below are some configurations from /usr/share/alsa/alsa.conf if this helps.
defaults.pcm.minperiodtime 5000 # in us
defaults.pcm.ipc_key 5678293
defaults.pcm.ipc_gid audio
defaults.pcm.ipc_perm 0666
defaults.pcm.dmix.max_periods 0
defaults.pcm.dmix.channels 2
defaults.pcm.dmix.rate 48000
Is there a config file that has a higher priority than what I want to configure via the API?

Synthesizing Packets with Scapy

Today, I was handed a 300 million entry csv file of netflow records, and my objective is to convert the netflow data to synthesized packets by any means necessary. After a bit of researching, I've decided Scapy would be an incredible tool for this process. I've been fiddling with some of the commands and attempting to create accurate packets that depict that netflow data, but I'm struggling and would really appreciate help from someone whose dabbled with Scapy before.
Here is an example entry from my dataset:
1526284499.233,1526284795.166,157.239.11.35,41.75.41.198,443,55915,6,1,24,62,6537,1419,1441,32934,65535,
Below is what each comma separated value represents:
Start Timestamp (Epoch Format): 1526284499.233
End Timestamp (Epoch Format): 1526284795.166
Source IP: 157.239.11.35
Destination IP: 41.75.41.198
IP Header Protocol Number: 443 (HTTPS)
Source Port Number: 55915
Destination Port Number: 6 (TCP)
TOS Value in IP Header: 1 (FIN)
TCP Flags: 24 (ACK & PSH)
Number of Packets: 62
Number of Bytes: 6537
Router Ingress Port: 1419
Router Egress Port: 1441
Source Autonomous System: 32934 (Facebook)
Destination Autonomous System: 65535
My Current Scapy Representation of this Entry:
>>> size = bytes(6537)
>>> packet = IP(src="157.240.11.35", dst="41.75.41.200", chksum=24, tos=1, proto=443) / TCP(sport=55915, dport=6, flags=24) / Raw(size)
packet.show():
###[ IP ]###
version= 4
ihl= None
tos= 0x1
len= None
id= 1
flags=
frag= 0
ttl= 64
proto= 443
chksum= 0x18
src= 157.240.11.35
dst= 41.75.41.200
\options\
###[ TCP ]###
sport= 55915
dport= 6
seq= 0
ack= 0
dataofs= None
reserved= 0
flags= PA
window= 8192
chksum= None
urgptr= 0
options= []
###[ Raw ]###
load= '6537'
My Confusion:
Frankly, I'm not sure if this is right. Where I get confused is that the IP Protocol Header is 443, indicating HTTPS, however the destination port is 6, indicating TCP. Therefore, I'm not sure if I should include TCP or not, or if including the proto IP attribute is gratuitous. Furthermore, I'm not sure if Raw() is the correct way to include the size of each packet, let alone if I defined size in a proper manner.
Please be so kind as to let me know where I've gone wrong, or if I actually miraculously created a perfect synthesized packet for this particular entry. Thank you so much!
I think the columns might be wrong. HTTPS is TCP port 443 (usually), so the protocol number should be 6 (TCP) and one of the ports should be 443. My GUESS is that 443 is the source port, since the source IP belongs to Facebook, making 55915 the destination port. So, I think the columns there go: source IP, dest IP, source port, dest port, protocol.

Load u-boot.ldr in bf548 ezkit using sd card

i am working on BF548 EZKIT LITE, i had done tftp booting in it. Kernel and jffs2 file system loaded successfully and got the root prompt.
But now i need to use SD card for booting, I had made ext2 partition into sd card and copy u-boot.ldr(boot loader) in it, but when try to load this file after inserting SD card into board i had got an error like
tranfering data failed
** ext4fs_devread read error - block
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
search on net but could not find anything , add log for detail which shows SD card is detected.
bfin> mmcinfo
Device: Blackfin SDH
Manufacturer ID: 3
OEM: 5344
Name: SD02G
Tran Speed: 25000000
Rd Block Len: 512
SD version 2.0
High Capacity: No
Capacity: 1.8 GiB
Bus Width: 4-bit
bfin>
bfin> ext2load mmc 0 0x1000000 u-boot.ldr
tranfering data failed
** ext4fs_devread read error - block
Failed to mount ext2 filesystem...
** Unrecognized filesystem type **
bfin>
I had tried different sd card also but still got the same problem, Any one have clue about this? Please share.
U-boot version= 2014.07.
Linux kernel = 4.5.4
I am using Buildroot for making board support package.
Thank in advance....
Ah, so your problem is:
bfin> ext2load mmc 0 0x1000000 u-boot.ldr
and this should be:
bfin> ext4load mmc 0:1 0x1000000 u-boot.ldr
as you need to specify both the MMC device (0) and the partition on the device (1 as you made 1 partition on the SD card and formatted that). Just saying 0 causes it to try and read the whole device as where the filesystem is which fails when it runs into the partition table. And you need to use 'ext4load' (or just load, if you have the generic commands enabled) as well since you've got ext3/ext4 most likely and not just ext2.

Can't boot basic OpenEmbedded-Core on Freescale i.MX28

I've been trying to build and boot OpenEmbedded-Core on the evaluation kit for Freescale's ARM i.MX28, using the Freescale ARM layer for OpenEmbedded-Core. Unfortunately, I can't find a basic "Getting Started" guide (though there is a Yocto getting-started guide). Unfortunately, I haven't been able to "get started", to the point of successfully booting to a basic command prompt on the board's debug serial port.
Here is what I've been able to piece together, and as far as I've managed to get so far.
Fetch sources
mkdir -p oe-core/freescale-arm
cd oe-core/freescale-arm
git clone git://git.openembedded.org/openembedded-core oe-core
git clone git://github.com/Freescale/meta-fsl-arm.git
cd oe-core
git clone git://git.openembedded.org/meta-openembedded
git clone git://git.openembedded.org/bitbake bitbake
Set up environment
. ./oe-init-build-env
That puts us in a new sub-directory build and sets certain environment variables.
Edit configuration
Edit the conf/bblayers.conf and local.conf files:
conf/bblayers.conf should have the meta-fls-arm and meta-oe layers added for BBLAYERS. E.g.:
BBLAYERS ?= " \
/home/craigm/oe-core/freescale-arm/oe-core/meta \
/home/craigm/oe-core/freescale-arm/oe-core/meta-openembedded/meta-oe \
${TOPDIR}/../../meta-fsl-arm \
"
In conf/local.conf, I set:
BB_NUMBER_THREADS = "4"
PARALLEL_MAKE = "-j 4"
MACHINE = "imx28evk"
Build
bitbake core-image-minimal
I ran this build overnight, and it has completed successfully for me. Output files were in ~/oe-core/freescale-arm/oe-core/build/tmp-eglibc/deploy/images.
There are two boot options that I'd like to try, as described below. Boot from SD card is simpler, but takes quite a long time (~30 min) to write the image to the SD card. Booting from TFTP + NFS is faster, but requires more set-up.
Boot from SD Card
Write image to SD card:
sudo dd if=tmp-eglibc/deploy/images/core-image-minimal-imx28evk.sdcard of=/dev/sdc
It took something like 30 minutes (3.5 GB file). Then I put it in the board's SD card slot 0, and powered-up. It got as far as loading the kernel, then stopped:
U-Boot 2012.04.01-00059-g4e6e824 (Aug 23 2012 - 18:08:54)
Freescale i.MX28 family at 454 MHz
BOOT: SSP SD/MMC #0, 3V3
DRAM: 128 MiB
MMC: MXS MMC: 0
*** Warning - bad CRC, using default environment
In: serial
Out: serial
Err: serial
Net: FEC0, FEC1
Hit any key to stop autoboot: 0
reading boot.scr
** Unable to read "boot.scr" from mmc 0:2 **
reading uImage
2598200 bytes read
Booting from mmc ...
## Booting kernel from Legacy Image at 42000000 ...
Image Name: Linux-2.6.35.3-11.09.01+yocto-20
Created: 2012-08-23 7:53:40 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2598136 Bytes = 2.5 MiB
Load Address: 40008000
Entry Point: 40008000
Verifying Checksum ... OK
Loading Kernel Image ... OK
OK
Starting kernel ...
Uncompressing Linux... done, booting the kernel.
Boot from TFTP + NFS
First, I tried to write U-Boot onto an SD card:
sudo dd if=tmp-eglibc/deploy/images/u-boot-imx28evk.mxsboot-sdcard of=/dev/sdc
Then I put it in the board's SD card slot 0, and powered-up. But all I got in the debug serial port was:
0x8020a01d
So, I decided to use Freescale's distribution of U-Boot for i.MX28 (from their LTIB distribution) onto an SD card. I set suitable U-Boot parameters for NFS booting with parameters from DHCP.
setenv bootargs console=ttyAMA0,115200n8
setenv bootargs_nfs setenv bootargs ${bootargs} root=/dev/nfs ip=dhcp nfsroot=,v3,tcp fec_mac=${ethaddr}
saveenv
I connected to a DD-WRT router with the following DNSmasq settings:
dhcp-boot=,,192.168.250.106
dhcp-option=17,"192.168.250.106:/home/craigm/rootfs"
On my host PC, I set up a TFTP server to serve the uImage file from ~/oe-core/freescale-arm/oe-core/build/tmp-eglibc/deploy/images/.
I also set up a root NFS server to serve the root file system. I edited /etc/exports to serve /home/craigm/rootfs. I extracted the root file system:
bitbake meta-ide-support
rm -Rf ~/rootfs
runqemu-extract-sdk tmp-eglibc/deploy/images/core-image-minimal-imx28evk.tar.bz2 ~/rootfs
Then I put the U-Boot SD card in the board's SD card slot 0, and powered-up. It got as far as this, then stopped:
...
TCP cubic registered
NET: Registered protocol family 17
can: controller area network core (rev 20090105 abi 8)
NET: Registered protocol family 29
can: raw protocol (rev 20090105)
mxs-rtc mxs-rtc.0: setting system clock to 1970-01-01 00:03:33 UTC (213)
eth0: Freescale FEC PHY driver [Generic PHY] (mii_bus:phy_addr=0:00, irq=-1)
eth1: Freescale FEC PHY driver [Generic PHY] (mii_bus:phy_addr=0:01, irq=-1)
Sending DHCP requests .
PHY: 0:00 - Link is Up - 100/Full
., OK
IP-Config: Got DHCP answer from 192.168.250.106, my address is 192.168.250.142
IP-Config: Complete:
device=eth0, addr=192.168.250.142, mask=255.255.255.0, gw=192.168.250.1,
host=192.168.250.142, domain=, nis-domain=(none),
bootserver=192.168.250.106, rootserver=192.168.250.106, rootpath=/home/craigm/rootfs
Looking up port of RPC 100003/3 on 192.168.250.106
Looking up port of RPC 100005/3 on 192.168.250.106
VFS: Mounted root (nfs filesystem) on device 0:15.
Freeing init memory: 160K
I'm not sure if it's running without a serial console, or some other problem. I can ping it on 192.168.250.142, but I can't Telnet or SSH to it.
Questions
Is there any sort of "getting started" guide for OpenEmbedded-Core on Freescale's i.MX28?
Is the Freescale ARM layer really intended for use with OpenEmbedded-Core, Yocto, or what? I don't really understand how those projects relate.
Has anyone else had success booting a minimal image of OpenEmbedded-Core on Freescale's i.MX28? If so, how did your procedure differ from mine?
I'm not sure at this stage whether the problem is just a non-functional serial console, or some other sort of issue. It's hard to diagnose these problems that prevent even getting a basic system running. Any pointers on how to diagnose at this point?
Why would the U-Boot be broken so it can't even boot?
From the boot message it looks like U-boot is working fine. U-boot is not broken.
The following boot message
2598200 bytes read
Booting from mmc ...
## Booting kernel from Legacy Image at 42000000 ...
Image Name: Linux-2.6.35.3-11.09.01+yocto-20
Created: 2012-08-23 7:53:40 UTC
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2598136 Bytes = 2.5 MiB
Load Address: 40008000
Entry Point: 40008000
Verifying Checksum ... OK
Loading Kernel Image ... OK
OK
Starting kernel ...
Uncompressing Linux... done, booting the kernel.
By the above log u-boot has done its job.
"Booting the kernel" is the point where bootloader given the control to the kernel.
I guess the problem might be in kernel image or in memory.
To eliminate the problem of memory, try to check the manual and try to read and write in the RAM using the board's reference manual.
From the boot log RAM looks like has no problem. It has been initialized.
DRAM: 128 MiB
Check if the following message is not causing any problem.
* Warning - bad CRC, using default environment
Check if all the devices have been initialized and nothing is skipped after the bad crc warning.

How can I have DNS name resolving running while other protocols seem to be down?

We are trying to implement a software based on Moxa UC-7112-LX embedded computer (uClinux OS). We use Cinteron MC52i GSM modem (regular GPRS service) and standart pppd to connect to the Internet.
Everything seems to be fine, right after the connection. Ping utility is working, Socket functions in my program work normally too. However after some time ppp connection brokes in a very peculiar way. These are the symptoms of that situation:
When I call ping utility with some host name as parameter the system is able to resolve it's IP and starts sending ICMP packets but gets no response. I am trying different web resources names, so that the system cannot have their addresses cached or something. Whatever I choose, the system correctly resolves IP but can't get any ping responce.
connect() and write() functions in my application give no error return but when it comes to read() the function returns with errno set to ECONNRESET (Connection reset by peer). The program uses standard socket functions (TCP protocol)
the ppp link is shown as running (ifconfig ppp0)
So, the situation that I have is: the link is good enough to maintain DNS resolving service (UDP is working?) but NOT good enough to run TCP connection and receive ping echoes...
The situation does not appear all the time. Sometimes the system can work normally for days without any problem. Whenever the problem appears, simple reset solves everything.
I know that the system we use is quite exotic, and the situation described here may be connected with some buggy tcp stack or pppd implementation. Considering that the system is preconfigured by the manufacturer I don't have any options to rebuild/change the OS firmware.
Still I hope that someone have seen the similar situation on any linux-like system. Is there any way to test why DNS name resolving is working while the other network stuff does not? Is it possible to remove such connection state with some pppd settings?
Edit:
First of all, I'd like to address the possibility of local caching of the IP addresses. I don't have dig utility and I have no idea how to check which host gives the result to getaddrinfo(). Still I'm sure that the addresses are not cached cause I'm trying to ping totally random URLs. Also given the slow GPRS response time it is not necessary to have the time measuring utility to see that ping takes 1-2 seconds or more to resolve IP before starting sending out packets. Furthermore ncsd, BIND or any dns servers do not run locally on the machine. I understand that you may not see that as proof, but that's what I have given the utility set available on my system.
I'd like to give some additional information concerning the internet connection operation.
Normal connection state
The rc script at system load runs another script as background process:
sh /etc/connect &
The connect script is as follows:
#!/bin/sh
echo First connect attempt > /etc/ppp/conn.info
while true
do
date >> /etc/ppp/conn.info
pppd call mts
echo Reconnecting... >> /etc/ppp/conn.info
done
The reason that I've made a loop here is simple: the connection persists for several hours and after that it always breaks. Unfortunately my implementation of pppd does not support the logfile option (so I can't see why is it broken). persist does not seem to work either so I've come to the connect script above. The pppd options are:
/dev/ttyM0 115200 crtscts
connect 'chat -f /etc/ppp/peers/mts.chat'
noauth
user mts
password mts
noipdefault
usepeerdns
defaultroute
ifconfig ppp0 gives:
ppp0 Link encap:Point-Point Protocol
inet addr:172.22.22.109 P-t-P:192.168.254.254 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:34 errors:0 dropped:0 overruns:0 frame:0
TX packets:36 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:3130 (3.0 KiB) TX bytes:2250 (2.1 KiB)
And thats where it starts getting strange. Whenever I connect I'm getting different inet addr but P-t-p is always the same: 192.168.254.254. This is the same address that appears in default gateway entry, as given by netstat -rn:
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
192.168.254.254 0.0.0.0 255.255.255.255 UH 0 0 0 ppp0
192.168.4.0 0.0.0.0 255.255.255.0 U 0 0 0 eth1
192.168.15.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
192.168.0.0 192.168.15.1 255.255.0.0 UG 0 0 0 eth0
0.0.0.0 192.168.254.254 0.0.0.0 UG 0 0 0 ppp0
route -Cevn is unavailable on my system, route gives the same info as above.
But I'm never able to ping the 192.168.254.254, not even when everything is working as intended: tcp connection, ping, DNS etc. Here is the result of traceroute:
traceroute to kernel.org (149.20.4.69), 30 hops max, 40 byte packets
1 172.16.4.210 (172.16.4.210) 528.765 ms 545.269 ms 616.67 ms
2 172.16.4.226 (172.16.4.226) 563.034 ms 526.176 ms 537.07 ms
3 10.250.85.161 (10.250.85.161) 572.805 ms 564.073 ms 556.766 ms
4 172.31.250.9 (172.31.250.9) 556.513 ms 563.383 ms 580.724 ms
5 172.31.250.10 (172.31.250.10) 518.15 ms 526.403 ms 537.574 ms
6 pub2.kernel.org (149.20.4.69) 538.058 ms 514.222 ms 538.575 ms
7 pub2.kernel.org (149.20.4.69) 537.531 ms 538.52 ms 537.556 ms
8 pub2.kernel.org (149.20.4.69) 568.695 ms 523.099 ms 570.983 ms
9 pub2.kernel.org (149.20.4.69) 526.511 ms 534.583 ms 537.994 ms
##### traceroute loops here - why?? #######
So, I can assume that 172.16.4.210 is peer's address. Such address is pingable in any case (see below). I have no idea why the structure of traceroute output is like this (packets come from internal network of ISP right to the destination, 'loop' at the destination address - it just should not be like this).
Also I would like to note that I can ping DNS server but traceroute does not go all the way up to it.
You may notice that there are eth0 and eth1 devices. They are irrelevant to the case. eth1 is not connected and eth0 is connected to lan without internet access.
Bad connection state
So, some time passes and the situation under question appears. I can't ping anything but DNS server (and peer, the address for which I get from traceroute result for the DNS) and cant communicate with remote host via tcp. DNS resolving is working
The network utilites give the same output as in normal state. I have the same unpingable peer (192.168.254.254 from ifconfig result), the routing table is the same:
# ifconfig ppp0
ppp0 Link encap:Point-Point Protocol
inet addr:172.22.22.109 P-t-P:192.168.254.254 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:297 errors:0 dropped:0 overruns:0 frame:0
TX packets:424 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:3
RX bytes:33706 (32.9 KiB) TX bytes:27451 (26.8 KiB)
# route
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
192.168.254.254 * 255.255.255.255 UH 0 0 0 ppp0
192.168.4.0 * 255.255.255.0 U 0 0 0 eth1
192.168.15.0 * 255.255.255.0 U 0 0 0 eth0
192.168.0.0 192.168.15.1 255.255.0.0 UG 0 0 0 eth0
default 192.168.254.254 0.0.0.0 UG 0 0 0 ppp0
Note that the original ppp connection (one which I used to provide the output from normal state) persisted. My /etc/connect script did not loop (there was no new record in a makeshift log the script makes).
Here goes the ping to DNS server:
# cat /etc/resolv.conf
#search moxa.com
nameserver 213.87.0.1
nameserver 213.87.1.1
# ping 213.87.0.1
PING 213.87.0.1 (213.87.0.1): 56 data bytes
64 bytes from 213.87.0.1: icmp_seq=0 ttl=59 time=559.8 ms
64 bytes from 213.87.0.1: icmp_seq=1 ttl=59 time=509.9 ms
64 bytes from 213.87.0.1: icmp_seq=2 ttl=59 time=559.8 ms
And traceroute:
# traceroute 213.87.0.1
traceroute to 213.87.0.1 (213.87.0.1), 30 hops max, 40 byte packets
1 172.16.4.210 (172.16.4.210) 542.449 ms 572.858 ms 595.681 ms
2 172.16.4.214 (172.16.4.214) 590.392 ms 565.887 ms 676.919 ms
3 * * *
4 217.8.237.62 (217.8.237.62) 603.1 ms 569.078 ms 553.723 ms
5 * * *
6 * * *
## and so on ###
*** lines may look like trouble but im getting the same traceroute for that DNS in normal situation
ping to 172.16.4.210 works fine as well.
Now to TCP. I've started a simple echo server on my PC and tried to connect via telnet to it (the actual ip address is not shown):
# telnet XXX.XXX.XXX.XXX 9060
Trying XXX.XXX.XXX.XXX(25635)...
Connected to XXX.XXX.XXX.XXX.
Escape character is '^]'.
aaabbbccc
Connection closed by foreign host.
So thats what happened here. Successfull connect() just like in my custom application is followed by Connection closed... when telnet called read(). The actual server did not receive any incoming connection. Why did 'connect()' return normally (it could not get the handshake response from the host!) is beyond my scope of knowledge.
Sure enough same telnet test works fine in normal state.
Note:
I did not publish this on serverfault cause of the embedded nature of my system. serverfault as far as I understand deals with more conventional systems (like x86s running 'normal' linux). I just hope that stackoverflow has more embedded experts who know such systems as my Moxa.
Q: How can I have DNS name resolving running while other protocols seem to be down?
A: Your local DNS resolver (bind is another possibility besides ncsd) might be caching the first response. dig will tell you where you are getting the response from:
[mpenning#Bucksnort ~]$ dig cisco.com
; <<>> DiG 9.6-ESV-R4 <<>> +all cisco.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22106
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0
;; QUESTION SECTION:
;cisco.com. IN A
;; ANSWER SECTION:
cisco.com. 86367 IN A 198.133.219.25
;; AUTHORITY SECTION:
cisco.com. 86367 IN NS ns2.cisco.com.
cisco.com. 86367 IN NS ns1.cisco.com.
;; Query time: 1 msec <----------------------- 1msec is usually cached
;; SERVER: 127.0.0.1#53(127.0.0.1) <--------------- Answered by localhost
;; WHEN: Wed Dec 7 04:41:21 2011
;; MSG SIZE rcvd: 79
[mpenning#Bucksnort ~]$
If you are getting a very quick (low milliseconds) answer from 127.0.0.1, then it's very likely that you're getting a locally cached answer from a prior query of the same DNS name (and it's quite common for people to use caching DNS resolvers on a ppp connection to reduce connection time, as well as achieving a small load reduction on the ppp link).
If you suspect a cached answer, do a dig on some other DNS name to see whether it can resolve too.
If random DNS names continue resolution and you still cannot make a TCP connection to a certain host, this is worthy of noting when you edit the question after this investigation.
If random DNS names don't resolve, then this is indicative of something like the loss of your default route, or the ppp connection going down.
Other diagnostic information
If you find yourself in either of the last situations I described, you need to do some IP and ppp-level debugs before this can be isolated further. As someone mentioned, tcpdump is quite valuable at this point, but it sounds like you don't have it available.
I assume you are not making a TCP connection to the same IP address of your DNS server. There are many possibilities at this point... If you can still resolve random DNS names, but TCP connections are failing, it is possible that the problem you are seeing is on the other side of the ppp connection, that the kernel routing cache (which holds a little TCP state information like MSS) is getting messed up, you have too much packet loss for tcp, or any number of things.
Let's assume your topology is like this:
10.1.1.2/30 10.1.1.1/30
[ppp0] [pppX]
uCLinux----------------------AccessServer---->[To the reset of the network]
When you initiate your ppp connection, take note of your IP address and the address of your default gateway:
ip link show ppp0 # display the link status of your ppp0 intf (is it up?)
ip addr show ppp0 # display the IP address of your ppp0 interface
ip route show # display your routing table
route -Cevn # display the kernel's routing cache
Similar results can be found if you don't have the iproute2 package as part of your distro (iproute2 provides the ip utility):
ifconfig ppp0 # display link status and addresses on ppp0
netstat -rn # display routing table
route -Cevn # display kernel routing table
For those with the iproute2 utilities (which is almost everybody these days), ifconfig has been deprecated and replaced by the ip commands; however, if you have an older 2.2 or 2.4-based system you may still need to use ifconfig.
Troubleshooting steps:
When you start having the problem, first check whether you can ping the address of pppX on your access server.
If you can not ping the ip address of pppX on the other side, then it is highly unlikely your DNS is getting resolved by anything other than a cached response on your uCLinux machine.
If you can ping pppX, then try to ping the ip address of your TCP peer and the IP address of the DNS (if it is not on localhost). Unless there is a firewall involved, you must be able to ping it successfully for any of this to work.
If you can ping the ip address of pppX but you cannot ping your TCP peer's ip address, check your routing table to see whether your default route is still pointing out ppp0
If your default route points through ppp0, check whether you can still ping the ip address of the default route.
If you can ping your default route and you can ping the remote host that you're trying to connect to, check the kernel's routing cache for the IP address of the remote TCP host.... look for anything odd or suspicious
If you can ping the remote TCP host (and you need to do about 200 pings to be sure... tcp is sensitive to significant packet loss & GPRS is notoriously lossy), try making a successful telnet <remote_host> <remote_port>. If both are successful, then it's time to start looking inside your software for clues.
If you still can't untangle what is happening, please include the output of the aforementioned commands when you come back... as well as how you're starting the ppp connection.
Pings should never be part of an end-user application(see note), and no program should rely on ping to function. At best ping might tell us that a part of the TCP/IP stack was running on the remote. See my argument here.
What the OP describes as a problem doesn't seem to be a problem. All network connections fail, the resolver may or may not use the network, and ping isn't really helpful. I would guess that the OP can check that the modem is connected or not, and if it isn't connect again.
edit: Pseudo code
do until success
try
connect "foobar.com"
try
write data
read response
catch
not success
endtry
catch error
'modem down - reconnect
not success
end try
loop
Note: the exception would be if you are writing a network monitoring application for a networking person.

Resources