09 Feb, 2008, Guest wrote in the 1st comment:
Votes: 0
I have an application where I need to validate IPv6 addresses to be sure they're not broken. I figured I'd use a slick trick like so:

[php]return( $ip == @inet_ntop(@inet_pton($ip))) ? true : false;[/php]

This works on both IPv4 and IPv6 addresses to validate them. So far so good, right?

I ran into a problem validating something like this:
2001:0db8:0000:0000:0000:0000:1428:57ab

If the user types out all the zeros, the validation function fails because the return statement compares $ip with 2001:db8::1428:57ab, which is another valid way to write the same address. It doesn't match what the user typed, though, so the comparison fails and people are told they gave an invalid IP.

IPv6 addresses apparently can be expressed according to the following rules:

- A series of "0"s in a 16-bit block can be represented by "0".
- A series of blocks containing only "0"s can be suppressed and represented by "::" (this can be done only once).
- Leading zeros in a group can also be omitted (as in ::1 for localhost).

Which means the following are all proper methods of expressing the same address:

2001:0db8:0000:0000:0000:0000:1428:57ab
2001:0db8:0000:0000:0000::1428:57ab
2001:0db8:0:0:0:0:1428:57ab
2001:0db8:0:0::1428:57ab
2001:0db8::1428:57ab
2001:db8::1428:57ab
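
One way to see that these spellings are the same address is to compare the packed bytes instead of the text. The sketch below is only an illustration: `same_address()` is a made-up helper name, not something from this thread, and it assumes PHP 7 or later with `inet_pton()` available.

```php
<?php
// Hypothetical helper (not from the thread): true when both strings parse
// as IP addresses and pack to exactly the same bytes.
function same_address(string $a, string $b): bool
{
    // inet_pton() returns a packed binary string, or false on bad input.
    $pa = @inet_pton($a);
    $pb = @inet_pton($b);

    return $pa !== false && $pb !== false && $pa === $pb;
}

// The six spellings listed above.
$forms = [
    '2001:0db8:0000:0000:0000:0000:1428:57ab',
    '2001:0db8:0000:0000:0000::1428:57ab',
    '2001:0db8:0:0:0:0:1428:57ab',
    '2001:0db8:0:0::1428:57ab',
    '2001:0db8::1428:57ab',
    '2001:db8::1428:57ab',
];

foreach ($forms as $form) {
    // Compare every spelling against the shortest one.
    $verdict = same_address($form, '2001:db8::1428:57ab') ? 'same' : 'different';
    echo $form, ' => ', $verdict, "\n";
}
```

Comparing packed bytes sidesteps the text-form problem entirely, since the comparison no longer depends on how the user chose to write the zeros.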

I'm no regex guru, so I'm sort of stuck on what to do with this :(
09 Feb, 2008, Guest wrote in the 2nd comment:
Votes: 0
Actually, now that I think about it, something to reduce an entered IP to the short form is probably what I need. Then comparing that should work. It's just getting to the reduced version that I'm not sure of.
09 Feb, 2008, kiasyn wrote in the 3rd comment:
Votes: 0
/^([A-Fa-f0-9]{1,4}:){7}[A-Fa-f0-9]{1,4}$|^[A-Fa-f0-9]{1,4}::([A-Fa-f0-9]{1,4}:){0,5}[A-Fa-f0-9]{1,4}$|^([A-Fa-f0-9]{1,4}:){2}:([A-Fa-f0-9]{1,4}:){0,4}[A-Fa-f0-9]{1,4}$|^([A-Fa-f0-9]{1,4}:){3}:([A-Fa-f0-9]{1,4}:){0,3}[A-Fa-f0-9]{1,4}$|^([A-Fa-f0-9]{1,4}:){4}:([A-Fa-f0-9]{1,4}:){0,2}[A-Fa-f0-9]{1,4}$|^([A-Fa-f0-9]{1,4}:){5}:([A-Fa-f0-9]{1,4}:){0,1}[A-Fa-f0-9]{1,4}$|^([A-Fa-f0-9]{1,4}:){6}:[A-Fa-f0-9]{1,4}$/
10 Feb, 2008, David Haley wrote in the 4th comment:
Votes: 0
Here's how to reduce the IPs to their shortened form, which personally I find easier to grok… it's not the most efficient since it requires several passes, but it's much more understandable as a result.

$ cat ips.txt
2001:0db8:0000:0000:0000:0000:1428:57ab
2001:0db8:0000:0000:0000::1428:57ab
2001:0db8:0:0:0:0:1428:57ab
2001:0db8:0:0::1428:57ab
2001:0db8::1428:57ab
2001:db8::1428:57ab
$ cat shortener.pl
#!/usr/bin/perl

while (<STDIN>)
{
    my $ip = $_;

    # step 1: remove all leading zeroes
    s/^0*([A-Fa-f0-9])/$1/g;
    s/:0*([A-Fa-f1-9])/:$1/g;

    # step 2: collapse blocks of zeroes
    s/:+0+(:0+)*:0+:+/::/g;

    # all done.

    print "$ip \t\t\t--> $_";
}


$ cat ips.txt | ./shortener.pl
2001:0db8:0000:0000:0000:0000:1428:57ab
--> 2001:db8::1428:57ab
2001:0db8:0000:0000:0000::1428:57ab
--> 2001:db8::1428:57ab
2001:0db8:0:0:0:0:1428:57ab
--> 2001:db8::1428:57ab
2001:0db8:0:0::1428:57ab
--> 2001:db8::1428:57ab
2001:0db8::1428:57ab
--> 2001:db8::1428:57ab
2001:db8::1428:57ab
--> 2001:db8::1428:57ab


It should be easy enough to convert this to PHP.

Disclaimer: I haven't tested beyond the example you gave; it's possible it won't always work… I also haven't thought about it a huge amount. UAYOR, YMMV, as they say. :wink:
10 Feb, 2008, Guest wrote in the 5th comment:
Votes: 0
What are the actual regex commands in that perl script? I can get them stuffed into PHP easily enough if I know exactly what the patterns are.

Oh, and for those who think stats are cool, Post# 1000 baby :)
10 Feb, 2008, David Haley wrote in the 6th comment:
Votes: 0
The regex patterns are the things that start with s and go until the semicolon. They're substitutions, so

s/xyz/abc/

means substitute xyz with abc. The 'g' at the end means "do it as many times as possible".

Note that you probably need a variable, like:

$foo =~ (regex)

In Perl, if you don't specify one, the substitution applies to the $_ variable.
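
For anyone mapping this onto PHP, a rough equivalent of the Perl substitution forms can be sketched with `preg_replace()`. This is only an illustration with made-up sample strings; it assumes the PCRE functions rather than the `ereg` family.

```php
<?php
// Perl: s/xyz/abc/   replaces the first match only, so pass a limit of 1.
echo preg_replace('/xyz/', 'abc', 'xyz xyz', 1), "\n";   // abc xyz

// Perl: s/xyz/abc/g  replaces every match, which is preg_replace()'s default.
echo preg_replace('/xyz/', 'abc', 'xyz xyz'), "\n";      // abc abc

// $1 in the replacement refers to whatever the first (...) group captured.
// Single quotes keep PHP itself from touching the "$1".
echo preg_replace('/:0*([A-Fa-f1-9])/', ':$1', 'a:0db8:00c'), "\n";   // a:db8:c
```

There is no implicit `$_` in PHP: the subject string is always passed as the third argument and the result comes back as the return value.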
10 Feb, 2008, Guest wrote in the 7th comment:
Votes: 0
Something is not quite right. Either the patterns are defective or I translated them improperly:
[php]<?php
$ip = '2001:0db8:0000:0000:0000:0000:1428:57ab';

$newip = eregi_replace( "^0*([A-Fa-f0-9])", "$1", $ip );
$newip = eregi_replace( ":0*([A-Fa-f1-9])", ":$1", $newip );
$newip = eregi_replace( ":+0+(:0+)*:0+:+", "::", $newip );

echo $newip . "\n\n";
?>[/php]

Results in:

$1001:$1b8::$1428:$17ab

Edit: Never mind - apparently these functions expect the backreference to be written as \\1 instead of $1.
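
For reference, the same three substitutions can be sketched with `preg_replace()`, which accepts `$1` in the replacement. The `ereg` functions were deprecated in PHP 5.3 and removed in PHP 7.0, so this form is what runs on current versions. `shorten_ipv6()` is a hypothetical name, and the sketch has only been checked against the six sample addresses from this thread.

```php
<?php
// Hypothetical helper; the three patterns are the ones from shortener.pl.
function shorten_ipv6(string $ip): string
{
    // step 1: remove leading zeroes. The second pattern skips all-zero
    // groups on purpose, leaving them for step 2.
    $ip = preg_replace('/^0*([A-Fa-f0-9])/', '$1', $ip);
    $ip = preg_replace('/:0*([A-Fa-f1-9])/', ':$1', $ip);

    // step 2: collapse a run of two or more all-zero groups into "::".
    return preg_replace('/:+0+(:0+)*:0+:+/', '::', $ip);
}

echo shorten_ipv6('2001:0db8:0000:0000:0000:0000:1428:57ab'), "\n";
// 2001:db8::1428:57ab
```

Addresses with a single all-zero group, or with two separate runs of zero groups, fall outside those samples and may not reduce to the form `inet_ntop()` produces.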
12 Feb, 2008, Guest wrote in the 8th comment:
Votes: 0
BTW, in case it comes up and might be of use to someone else later, this is the end result of what I was doing with David's help:
[php]function is_valid_ip($ip, $type = 'A')
{
    if( $type == 'A' ) {
        if( strpos( $ip, '.' ) === false )
            return false;

        return( $ip == @inet_ntop(@inet_pton($ip))) ? true : false;
    } else if( $type == 'AAAA' ) {
        if( strpos( $ip, ':' ) === false )
            return false;
        if( strpos( $ip, '.' ) !== false )
            return false;

        $newip = eregi_replace( "^0*([A-Fa-f0-9])", "\\1", $ip );
        $newip = eregi_replace( ":0*([A-Fa-f1-9])", ":\\1", $newip );
        $newip = eregi_replace( ":+0+(:0+)*:0+:+", "::", $newip );

        return( $newip == @inet_ntop(@inet_pton($newip))) ? true : false;
    }
    return false;
}[/php]
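
A shorter variant, sketched here under a few assumptions, skips the text comparison and looks only at whether `inet_pton()` can parse the input and how many bytes it returns. `is_valid_ip_packed()` is a hypothetical name, and its behaviour is not identical to the function above: it accepts uppercase hex digits, and for 'AAAA' it should also accept mixed notation with an embedded dotted quad, which the function above rejects.

```php
<?php
// Hypothetical alternative; keeps the thread's 'A' / 'AAAA' type argument.
function is_valid_ip_packed(string $ip, string $type = 'A'): bool
{
    // inet_pton() returns false (and raises a warning) on unparsable input.
    $packed = @inet_pton($ip);
    if ($packed === false) {
        return false;
    }

    // An IPv4 address packs to 4 bytes, an IPv6 address to 16 bytes.
    $wanted = ['A' => 4, 'AAAA' => 16];

    return isset($wanted[$type]) && strlen($packed) === $wanted[$type];
}

var_dump(is_valid_ip_packed('2001:0db8:0000:0000:0000:0000:1428:57ab', 'AAAA')); // bool(true)
var_dump(is_valid_ip_packed('192.0.2.1', 'AAAA'));                               // bool(false)
```

Because nothing is compared as text, there is no need to shorten the address first; the cost is that the function no longer enforces any particular spelling.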